DESCRIPTION



Metabolic Selection Methods 



Field Of The Invention 

The present invention relates to methods for screening for 
enzymatic pathways, and the isolation of the genes and proteins 
that make up these pathways. 

Background Of The Invention 

The following description of the background of the invention 
is provided to aid in understanding the invention, but is not 
admitted to be, or to describe, prior art to the invention. 

Biological synthesis of compounds is frequently more cost
effective and more productive than chemical synthesis, which can
have low yields, require expensive and toxic reagents, and
require lengthy purifications. In contrast, biological synthesis
using known pathways can be rapid, with high yields. However,
the identification of new biological pathways for syntheses of
interest is difficult and time consuming.

Currently, the biochemical screening of isolates is a major
means by which people find new pathways for the production of
chemicals, antibacterials, and other anti-infectives. However,
screening is inherently several orders of magnitude slower than
selection and requires that the organism be cultured in the
laboratory. Since at least 99% of the microbes in the
environment do not grow on laboratory media, less than 1% can be
tested using a biochemical screen. Thus, biological pathways in
99% of organisms will never be found by classical biochemical
screening technologies.

Summary Of The Invention 

The metabolic selection strategy of this invention is
designed to find an enzymatic pathway for the conversion of any






WO 00/22170



PCT/US99/23862 



source compound to any target compound. Conservatively, this
technique allows at least a million-fold increase in the
discovery rate over classical biochemical screening approaches,
and allows testing of the 99% of the environmental microbes that
are currently unable to be cultured in the laboratory.

A biocatalytic or metabolic pathway consists of a series of
protein catalysts (enzymes) which catalyze the conversion of a
starting material to the final product. A general process to
identify the metabolic pathway from a source compound to a target
compound involves the creation/identification of an easily
genetically manipulable organism containing an inducible
signal, which is activated when a target compound is metabolized.
This is followed by the screening of nucleic acid in this
organism to identify genes which metabolize the source compound
to the target compound.

An example of a selection strategy which can be used to
identify the metabolic pathway from a source compound to a target
compound is diagrammed in Figure 11. As a first step, microbial
isolates are selected that are capable of metabolizing a target
compound "T", but not a source compound "S", to an essential
factor. Essential factors can include elements like carbon,
sulfur, phosphorous, and nitrogen, or other essential nutrients,
e.g. some amino acids, fatty acids, and carbohydrates. In a
second step, the pathway responsible for the catabolism of
compound "T" is identified and made conditional. That is, the
gene(s) for the pathway are cloned and placed under control of an
inducible promoter such that growth on the target compound is
turned "ON" only when the inducer is present. This engineered
strain is referred to as the "tester strain". The third part of
the strategy is the transfer of foreign DNA from environmental
sources into the tester strain, followed by selection for growth
on the source compound "S" in the presence of inducer. Such
positive clones either are capable of metabolizing compound "S"
in the absence of inducer, in which case utilization of "S" does
not require prior conversion to compound "T" (Figure 11; pathway
I), or alternatively metabolize compound "S" only when "T"
catabolism is "ON", suggesting that utilization of "S" proceeds
via compound "T" to intermediary metabolism (Figure 11; pathway
II). These latter clones are further analyzed and the
biocatalysts for the conversion of "S" to "T" are characterized.
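
The decision logic of the third step can be illustrated with a
short sketch. It is not part of the disclosed method; the names
CloneClass and classify_clone are hypothetical, and the growth
observations are assumed to have already been reduced to simple
yes/no calls for each clone.

```python
from enum import Enum


class CloneClass(Enum):
    """Hypothetical labels for the outcomes described in the text."""
    NOT_SELECTED = "no growth on S with inducer present"
    PATHWAY_I = "S utilized without prior conversion to T"
    PATHWAY_II = "S utilized via T"


def classify_clone(grows_with_inducer: bool,
                   grows_without_inducer: bool) -> CloneClass:
    """Classify a tester-strain clone by its growth on source compound S.

    Clones are first selected for growth on S with the inducer present
    (T catabolism "ON"). Positive clones that still grow when the inducer
    is absent do not depend on T catabolism (pathway I); clones that grow
    only with inducer are candidates for S -> T conversion (pathway II).
    """
    if not grows_with_inducer:
        # Never recovered by the selection, so not analyzed further.
        return CloneClass.NOT_SELECTED
    if grows_without_inducer:
        return CloneClass.PATHWAY_I
    return CloneClass.PATHWAY_II
```

Under these assumptions, only clones returned as PATHWAY_II would
be carried forward for characterization of the "S" to "T"
biocatalysts.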

A specific embodiment of the metabolic selection strategy is
shown in Figure 12, where "S" is 2-keto-L-gulonate (2-KLG), and
"T" is ascorbic acid (AsA) which can be metabolized to carbon and
energy.

Thus, in a first aspect, the invention features a method of
screening for one or more nucleic acid sequences which express a
product or products that convert a source compound into a target
compound. The method comprises contacting a cell with one or
more test nucleic acid sequences, where the cell expresses one or
more genes encoding one or more proteins which, in the presence
of the target compound, provide a detectable signal. The
detectable signal indicates the presence of the desired nucleic
acid sequence or sequences.

The term "screening" as used herein refers to methods for
identifying a nucleic acid sequence of interest. Preferably, the
method permits the identification of a nucleic acid sequence of
interest among one or more sequences, more preferably among
hundreds (100, 200, ... 900), most preferably among thousands
(1,000, 2,000, ... etc.) or more. The sequences to be screened can
be isolated from one or more organisms. Preferably, the
sequences are isolated from hundreds of organisms, more
preferably from thousands or more organisms. The term
"screening" may include both classical screening, whereby
expression of the nucleic acid results in a phenotype that can be
identified (for example by having a colony with the nucleic acid
of interest change color, fluoresce, or luminesce), and may also
include classical selection, where typically the phenotype to be
identified is growth on selective media. By "selective" is meant
media on which the host strain will not grow or grows poorly, but
on which strains with the nucleic acid of interest will grow in a
manner which can be readily distinguished from host strain growth
by methods well-known in the art.

The term "nucleic acid" as used herein refers to either
deoxyribonucleic acid or ribonucleic acid that may be isolated,
enriched, or purified from natural sources or synthesized
recombinantly. These methods are well-known in the art and
specific examples are also given herein. Preferably, a "nucleic
acid" to be identified in the screening method comprises a
nucleic acid encoding a metabolic pathway that is not normally
found in the cell. Thus, preferably, the pathway has not simply
been inactivated through a mutation and the relevant genes are
now being identified through complementation. Rather, the nucleic
acid being identified does not normally exist in the cell in
which it is being screened for. Typically, the screening is
cross-strain, more typically cross-species, and even more
preferably cross-genera or with further remoteness.

By "isolated, purified, or enriched" in reference to nucleic
acid is meant a polymer of 6 (preferably 21, more preferably 39,
most preferably 75) or more nucleotides conjugated to each other,
including DNA and RNA that is isolated from a natural source or
that is synthesized. In certain embodiments of the invention,
longer nucleic acids are preferred, for example those of 300,
600, 900 or more nucleotides and/or those having at least 50%,
60%, 75%, 90%, 95% or 99% identity to the sequence shown in SEQ
ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ
ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, or SEQ ID NO:19.

The isolated nucleic acid of the present invention is unique
in the sense that it is not found in a pure or separated state in
nature. Use of the term "isolated" indicates that a naturally
occurring sequence has been removed from its normal cellular
(i.e., chromosomal) environment. Thus, the sequence may be in a
cell-free solution or placed in a different cellular environment.
The term does not imply that the sequence is the only nucleotide
chain present, but that it is essentially free (about 90-95% pure
at least) of non-nucleotide material naturally associated with
it, and thus is distinguished from isolated chromosomes.

By the use of the term "enriched" in reference to nucleic
acid is meant that the specific DNA or RNA sequence constitutes
a significantly higher fraction (2-5 fold) of the total DNA or
RNA present in the cells or solution of interest than in normal
or diseased cells or in the cells from which the sequence was
taken. This could be caused by a person by preferential
reduction in the amount of other DNA or RNA present, or by a
preferential increase in the amount of the specific DNA or RNA
sequence, or by a combination of the two. However, it should be
noted that "enriched" does not imply that there are no other DNA
or RNA sequences present, just that the relative amount of the
sequence of interest has been significantly increased. The term
"significant" is used to indicate that the level of increase is
useful to the person making such an increase, and generally means
an increase relative to other nucleic acids of at least about
2-fold, more preferably at least 5- to 10-fold or even more. The
term also does not imply that there is no DNA or RNA from other
sources. The other source DNA may, for example, comprise DNA
from a yeast or bacterial genome, or a cloning vector such as
pUC19. This term distinguishes from naturally occurring events,
such as viral infection, or tumor type growths, in which the
level of one mRNA may be naturally increased relative to other
species of mRNA. That is, the term is meant to cover only those
situations in which a person has intervened to elevate the
proportion of the desired nucleic acid.
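
As an illustration of the arithmetic behind "enriched", the
following sketch computes a fold enrichment as the ratio of the
fraction of the sequence of interest after intervention to its
fraction before. The function names and the choice of units are
assumptions made for the example; any consistent unit of mass or
copy number would serve.

```python
def fold_enrichment(target_after: float, total_after: float,
                    target_before: float, total_before: float) -> float:
    """Ratio of the target's fraction of total nucleic acid after
    intervention to its fraction before intervention."""
    if total_after <= 0 or total_before <= 0:
        raise ValueError("total nucleic acid amounts must be positive")
    if target_before <= 0:
        raise ValueError("the starting amount of target must be positive")
    fraction_before = target_before / total_before
    fraction_after = target_after / total_after
    return fraction_after / fraction_before


def is_enriched(fold: float, minimum_fold: float = 2.0) -> bool:
    """Apply the 'at least about 2-fold' threshold from the definition."""
    return fold >= minimum_fold
```

For example, a sequence that makes up 1/16 of the total before
treatment and 1/4 afterwards is enriched 4-fold, which meets the
2-fold threshold.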

It is also advantageous for some purposes that a nucleotide
sequence be in purified form. The term "purified" in reference
to nucleic acid does not require absolute purity (such as a
homogeneous preparation). Instead, it represents an indication
that the sequence is relatively more pure than in the natural
environment (compared to the natural level this level should be
at least 2-5 fold greater, e.g., in terms of mg/mL). Individual
clones isolated from a cDNA library may be purified to
electrophoretic homogeneity. The claimed DNA molecules obtained
from these clones could be obtained directly from total DNA or
from total RNA. The cDNA clones are not naturally occurring, but
rather are preferably obtained via manipulation of a partially
purified naturally occurring substance (messenger RNA). The
construction of a cDNA library from mRNA involves the creation of
a synthetic substance (cDNA) and pure individual cDNA clones can
be isolated from the synthetic library by clonal selection of the
cells carrying the cDNA library. Thus, the process which
includes the construction of a cDNA library from mRNA and
isolation of distinct cDNA clones yields an approximately 10^-fold
purification of the native message. Thus, purification of
at least one order of magnitude, preferably two or three orders,
and more preferably four or five orders of magnitude is expressly
contemplated.

The term "expresses a product" as used herein refers to the
production of proteins from a nucleic acid vector containing
genes within a cell. The nucleic acid vector is transfected into
cells using well known techniques in the art as described herein.
The "product" may, or may not, be naturally present in the cell.

The term "nucleic acid vector" relates to a single- or
double-stranded circular nucleic acid molecule that can be
transfected into cells and replicated within or independently of
a cell genome. A circular double-stranded nucleic acid molecule
can be cut and thereby linearized upon treatment with restriction
enzymes. An assortment of nucleic acid vectors, restriction
enzymes, and the knowledge of the nucleotide sequences cut by
restriction enzymes are readily available to those skilled in the
art. A nucleic acid molecule encoding a desired product can be
inserted into a vector by cutting the vector with restriction
enzymes and ligating the pieces together, depending on the
availability of useful restriction sites. However, there are
many methods well-known in the art for the insertion of nucleic
acid sequences into vectors.

The term "transfecting" as used herein includes a number of
methods to insert a nucleic acid vector or other nucleic acid
molecules into a cellular organism. These methods involve a
variety of techniques, such as treating the cells with high
concentrations of salt, an electric field, detergent, or DMSO to
render the outer membrane or wall of the cells permeable to
nucleic acid molecules of interest, or use of various viral
transduction strategies.

The term "converts" as used herein refers to changing one
compound into another compound, preferably enzymatically. The
"source compound" refers to the compound to be converted to the
"target compound." The "target compound" includes not only the
compound that is metabolized to form a detectable signal, but can
also include intermediates along the path to a detectable signal.
This is particularly preferred if the target compound is a
surrogate target. By "surrogate target compound" is meant a
target that is used because the preferable target cannot be used
for any of several potential reasons (e.g. if it doesn't cross
membranes, has a short half-life, is easily broken down, etc.). The
"target compound" also includes interconvertible compounds. By
"interconvertible" is meant that a pathway exists in the tester
strain to convert the compound to the target compound.

The term "contacting" as used herein refers to mixing a
solution comprising the test nucleic acid with a liquid medium
bathing the cells of the methods. The solution comprising the
nucleic acid may also comprise other components, such as dimethyl
sulfoxide (DMSO), which facilitates the uptake of the test
nucleic acid into the cells of the methods. This may also be
done by other methods well-known in the art including, but not
limited to, transfection or transformation techniques. The
solution comprising the test nucleic acid may be added to the
medium bathing the cells by utilizing a delivery apparatus, such
as a pipet-based device or syringe-based device.

The term "cell" as used herein includes the typical
definition of a cell, and is further specifically intended to
include "cell-free" systems comprising the cellular machinery
necessary to express the nucleic acid of the invention. By
"cellular machinery" is meant the cellular components present in
cell-free transcription and/or translation systems. Such systems
are well-known in the art. In particular, the "cell" lacks the
ability to convert a source compound into a target compound,
prior to the addition of test nucleic acid sequences. The term
"lacks the ability" also includes cells in which the activity may
be present but is at too low a level to provide a detectable
signal, or is low enough that an additional activity is
detectably different. By "detectably different" is meant able to
be measured over the background level (e.g. the level of the
signal endogenously present in the "cell" and in the equipment
used to measure the signal) by an amount greater than the level
of error present in the method of measuring.

The term "detectable signal" as used herein refers to a
means of identifying the nucleic acids of interest, e.g. by
color, fluorescence, luminescence or growth.

In preferred embodiments of the method for screening nucleic
acid that converts a source compound into a target compound, the
one or more nucleic acid sequences encode a metabolic pathway
not normally present in said cell. A "metabolic pathway"
consists of a series of protein catalysts (enzymes) which
catalyze the conversion of a starting material to a product.
Further, by "metabolic pathway" is meant the enzymes, and genes
that encode them, that metabolize a source compound to a target
compound.

In other preferred embodiments, the nucleic acid is selected
from the group consisting of mutagenized DNA, environmental DNA,
combinatorial libraries, and recombinant DNA. Preferably, the
environmental DNA is selected from the group consisting of mud,
soil, sewage, flood control channels, sand, and water.
Preferably the mutagenized DNA is the result of enzyme
mutagenesis where the mutagenesis is selected from the group
consisting of random, chemical, PCR-based, and directed
mutagenesis. The directed mutagenesis is to include, for
example, DNA shuffling. Preferably the enzymes to be mutagenized
in this way are selected from the group consisting of lactonases,
ester hydrolases, and reductases.

The term "environmental" as used herein refers to nucleic
acids extracted from the environment, e.g. from mud, soil, or
water. By "extracted" is meant isolated, enriched, or purified
as defined above. The environmental sample can be directly
extracted without prior laboratory culture, or can be
pre-cultured, for example, in the presence of a growth selective
agent. Methods are known in the art and examples are described
herein.

In still other preferred embodiments of the method for
screening nucleic acid that converts a source compound into a
target compound, the detectable signal is selected from a group
consisting of growth, fluorescence, luminescence, and color.
Methods for detecting these signals are well-known in the art.
Preferably, the detectable signal is growth, and the target
compound provides an element or factor required for growth.
Preferably the target compound is selected from the group
consisting of ascorbate and 2-keto-L-gulonate (2-KLG), most
preferably ascorbate. Preferably the element is selected from
the group consisting of carbon, nitrogen, sulfur, and
phosphorous. Most preferably, the element is carbon.
Alternatively, the essential factor is another essential
nutrient. By "required for growth" is meant that the organism
does not grow detectably in the absence of the element. By
"provides an element" is meant that the compound can be
metabolized by the organism, and that the result of this
metabolism is the element in some form, e.g. carbon or carbon
dioxide.

In other preferred embodiments of the method for screening
nucleic acid that converts a source compound into a target
compound, the source compound is selected from the group
consisting of 2-keto-L-gulonate (2-KLG), 2,5-deoxy-keto-gulonate
(2,5-DKG), L-idonate (L-IA), L-gulonate (L-GuA), and glucose, and
most preferably 2-KLG.

In still other preferred embodiments of the method for
screening nucleic acid that converts a source compound into a
target compound, the cell naturally expresses the one or more
genes encoding one or more proteins that in the presence of the
target compound provide a detectable signal. Alternatively, the
cell can be genetically manipulated to express the one or more
genes encoding one or more proteins that in the presence of the
target compound provide a detectable signal. In both cases, the
one or more proteins are preferably Yia operon-related
polypeptides. The one or more genes are preferably under the
control of an inducible promoter. The inducible promoter
preferably comprises the trp-lac hybrid promoter, the lacO
operator, and the lacI repressor.

By "naturally expresses" is meant that the genes encoding
the proteins are present in the cell in its natural state, e.g.
in nature, prior to culture in the laboratory. The genes may or
may not be expressed in the natural state, or may or may not be
expressed constitutively or inducibly. By "genetically
manipulated to express" is meant the transfection of the desired
genes into the cell by methods well-known in the art, examples of
which are described herein.

The term "promoter" as used herein, refers to nucleic acid
sequence needed for gene sequence expression. Promoter regions
vary from organism to organism, but are well known to persons
skilled in the art for different organisms. For example, in
prokaryotes, the promoter region contains both the promoter
(which directs the initiation of RNA transcription) as well as
the DNA sequences which, when transcribed into RNA, will signal
synthesis initiation. Such regions will normally include those
5'-non-coding sequences involved with initiation of transcription
and translation, such as the TATA box, capping sequence, CAAT
sequence, ribosome binding site, start codon, and the like. By
"inducible promoter" is meant a promoter which is only "on" in
the presence of an inducer. The "inducer" is typically a small
molecule. Inducible promoters and inducers are well-known in the
art and examples are given herein.

The term "Yia operon-related polypeptides" as used herein
refers to polypeptides comprising 12 (preferably 15, more
preferably 20, most preferably 30) or more contiguous amino acids
set forth in the full-length amino acid sequence of SEQ ID NO:10;
31 (preferably 35, more preferably 40, most preferably 50) or
more contiguous amino acids set forth in the full-length amino
acid sequence of SEQ ID NO:11; 5 (preferably 10, more preferably
15, most preferably 25) or more contiguous amino acids set forth
in the full-length amino acid sequence of SEQ ID NO:12, SEQ ID
NO:13, or SEQ ID NO:14; 17 (preferably 20, more preferably 25,
most preferably 35) or more contiguous amino acids set forth in
the full-length amino acid sequence of SEQ ID NO:15, SEQ ID
NO:17, or SEQ ID NO:18; 11 (preferably 15, more preferably 20,
most preferably 30) or more contiguous amino acids set forth in
the full-length amino acid sequence of SEQ ID NO:16; or a
functional derivative thereof as described herein. In certain
aspects, polypeptides of 100, 200, 300 or more amino acids are
preferred. The Yia operon-related polypeptide can be encoded by
its corresponding full-length nucleic acid sequence or any
portion of its corresponding full-length nucleic acid sequence,
so long as a functional activity of the polypeptide is retained
(see Examples section). It is well known in the art that due to
the degeneracy of the genetic code numerous different nucleic
acid sequences can code for the same amino acid sequence.
Equally, it is also well known in the art that conservative
changes in amino acid can be made to arrive at a protein or
polypeptide which retains the functionality of the original. In
both cases, all permutations are within the embodiments of the
invention.
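
The "N or more contiguous amino acids" criterion can be checked
mechanically. The sketch below is only an illustration: the
function name shares_contiguous_run is hypothetical, and the
sequences used to exercise it are toy strings, not the sequences
of the SEQ ID NOs, which are not reproduced in this section.

```python
def shares_contiguous_run(candidate: str, reference: str, min_len: int) -> bool:
    """Return True if `candidate` contains at least `min_len` contiguous
    residues exactly as they appear somewhere in `reference`.

    Any shared run longer than `min_len` necessarily contains a shared
    run of exactly `min_len`, so checking windows of that length suffices.
    """
    if min_len <= 0:
        raise ValueError("min_len must be a positive integer")
    last_start = len(reference) - min_len
    return any(
        reference[start:start + min_len] in candidate
        for start in range(last_start + 1)
    )
```

With a toy reference of "MKTAYIAKQRQISFVK", a candidate
"XXMKTAYIAKQRXX" shares a run of 10 residues, so it passes a
threshold of 10 but not a threshold of 12.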

The amino acid sequence of the Yia operon-related
polypeptide will be substantially similar to the sequence shown
in SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID
NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, or SEQ ID NO:18,
or fragments thereof. A sequence that is substantially similar
to the sequence of SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ
ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17,
or SEQ ID NO:18 will preferably have at least 90% identity (more
preferably at least 95% and most preferably 98-100%) to the
sequence of SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID
NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, or
SEQ ID NO:18 using a Smith-Waterman protein-protein search.

By "identity" is meant a property of sequences that measures
their similarity or relationship. Identity is measured by
dividing the number of identical residues by the total number of
residues and gaps and multiplying the result by 100. "Gaps" are
spaces in an alignment that are the result of additions or
deletions of amino acids. Thus, two copies of exactly the same
sequence have 100% identity, but sequences that are less highly
conserved, and have deletions, additions, or replacements, may
have a lower degree of identity. Those skilled in the art will
recognize that several computer programs are available for
determining sequence identity. For example, the computer
algorithm BLAST is preferably used to search for homologous
sequences in a database, and CLUSTAL is used to perform
alignments. Identity and similarity determinations can be made
using a Smith-Waterman protein-protein search, for example.
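
A minimal sketch of the identity calculation follows. It assumes
the two sequences have already been aligned by some external
program, that gaps are written as "-", and that "the total number
of residues and gaps" corresponds to the number of alignment
columns; the function name percent_identity is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    identity = 100 * (identical residue columns) / (all alignment columns),
    where columns containing a gap count toward the denominator only.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)
```

Two copies of the same sequence give 100%, while an alignment such
as "MKT-LV" against "MKTALV" has five identical columns out of six
and therefore a lower identity.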

In still other preferred embodiments of the method for
screening nucleic acid that converts a source compound into a
target compound, the cell grows on ascorbate and does not grow on
2-KLG. Alternatively, the cell may grow on 2-KLG and not grow on
2,5-DKG. Preferably the cells are bacteria. Most preferably,
the cell selective for ascorbate is Klebsiella oxytoca. By
"grows on" is meant that the cell can utilize the compound (e.g.
ascorbate or 2-KLG) as a source of carbon in the minimal
essential media. However, the cell is unable to grow in the
minimal essential media in the absence of the provided carbon
source. Thus, this provides a selective tool for the
identification of the nucleic acid encoding the polypeptides of
interest.

A second aspect of the invention features an isolated,
enriched, or purified nucleic acid molecule encoding one or more
Yia operon-related polypeptides selected from the group
consisting of YiaJ, YiaK, YiaL, ORF1, YiaX2, LyxK, YiaQ, YiaR,
and YiaS.

In preferred embodiments, the isolated, enriched, or
purified nucleic acid molecule encoding one or more Yia
operon-related polypeptides comprises a nucleotide sequence that: (a)
encodes a polypeptide having the full length amino acid sequence
set forth in SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID
NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, or
SEQ ID NO:18; (b) is the complement of the nucleotide sequence of
(a); and (c) hybridizes under highly stringent conditions to the
nucleotide molecule of (a) and encodes a naturally occurring
polypeptide.

In another preferred embodiment, the invention features an
isolated, enriched, or purified nucleic acid molecule, wherein
said nucleic acid molecule comprises the nucleotide sequence set
forth in SEQ ID NO:19. The nucleic acid molecule comprises: (a)
one or more nucleotide sequences that are set forth in SEQ ID
NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID
NO:6, SEQ ID NO:7, SEQ ID NO:8, or SEQ ID NO:9; (b) the
complement of the nucleotide sequence of (a); (c) nucleic acid
that hybridizes under stringent conditions to the nucleotide
molecule of (a); (d) the full length sequence of SEQ ID NO:19,
except that it lacks one or more of the sequences set forth in
SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5,
SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, or SEQ ID NO:9; or (e) is
the complement of the nucleotide sequence of (d).

The term "complement" refers to two nucleotides that can
form multiple thermodynamically favorable interactions with one
another. For example, adenine is complementary to thymine as
they can form two hydrogen bonds. Similarly, guanine and
cytosine are complementary since they can form three hydrogen
bonds. A nucleotide sequence is the complement of another
nucleotide sequence if the nucleotides of the first sequence are
complementary to the nucleotides of the second sequence. The
percent of complementarity (i.e. how many nucleotides from one
strand form multiple thermodynamically favorable interactions
with the other strand compared with the total number of
nucleotides present in the sequence) indicates the extent of
complementarity of two sequences.
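
The percent of complementarity can be sketched as follows for two
DNA strands of equal length. The sketch assumes Watson-Crick
pairing only (A with T, G with C) and assumes that both strands
are written 5' to 3', so the second strand is read in reverse when
pairing; neither convention is stated in the definition above, and
the function name is hypothetical.

```python
# Watson-Crick partners for DNA bases.
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(strand: str, other: str) -> float:
    """Percentage of positions in `strand` that pair with `other`.

    Both strands are given 5'->3'; `other` is reversed so that the
    strands are compared in antiparallel orientation.
    """
    if len(strand) != len(other):
        raise ValueError("strands must have the same length")
    if not strand:
        raise ValueError("strands must not be empty")
    paired = sum(
        1
        for base, partner in zip(strand.upper(), reversed(other.upper()))
        if WATSON_CRICK.get(base) == partner
    )
    return 100.0 * paired / len(strand)
```

For instance, "ATGC" and "GCAT" pair at every position, whereas
"ATGC" and "GCAA" pair at three of four positions.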

Various low or high stringency hybridization conditions may
be used depending upon the specificity and selectivity desired.
These conditions are well-known to those skilled in the art.
Under stringent hybridization conditions only highly
complementary nucleic acid sequences hybridize. Preferably, such
conditions prevent hybridization of nucleic acids having 1 or 2
mismatches out of 20 contiguous nucleotides.
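
The "mismatches out of 20 contiguous nucleotides" criterion can be
expressed as a sliding-window count. The sketch below is a
counting aid only and does not model hybridization thermodynamics.
It assumes the probe is compared position by position against the
sequence it would match perfectly, both written in the same
orientation; the function name is hypothetical.

```python
def max_mismatches_per_window(probe: str, expected: str, window: int = 20) -> int:
    """Largest number of mismatched positions found in any stretch of
    `window` contiguous nucleotides of two equal-length sequences."""
    if len(probe) != len(expected):
        raise ValueError("sequences must have the same length")
    if window <= 0 or len(probe) < window:
        raise ValueError("sequences must be at least one window long")
    mismatch = [int(x != y) for x, y in zip(probe, expected)]
    current = sum(mismatch[:window])
    worst = current
    # Slide the window one position at a time, updating the running count.
    for end in range(window, len(mismatch)):
        current += mismatch[end] - mismatch[end - window]
        worst = max(worst, current)
    return worst
```

Under the preferred conditions described above, a result of 1 or
more for any 20-nucleotide window would indicate a pair that is
not expected to hybridize.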

By "stringent hybridization conditions" is meant
hybridization conditions at least as stringent as the following:
hybridization in 50% formamide, 5X SSC, 50 mM NaH2PO4, pH 6.8,
0.5% SDS, 0.1 mg/mL sonicated salmon sperm DNA, and 5X Denhardt's
solution at 42 °C overnight; washing with 2X SSC, 0.1% SDS at 45
°C; and washing with 0.2X SSC, 0.1% SDS at 45 °C.

In other preferred embodiments the isolated, enriched, or
purified nucleic acid molecule encoding one or more Yia
operon-related polypeptides further comprises a vector or promoter
effective to initiate transcription in a host cell. Preferably,
the vector or promoter comprises the trp-lac hybrid promoter, the
lacO operator, and the lacI repressor gene. In still other
preferred embodiments, the nucleic acid molecule is isolated,
enriched, or purified from a bacterium, preferably Klebsiella
oxytoca.

The invention also features recombinant nucleic acid,
preferably in a cell or an organism. The recombinant nucleic
acid may contain a sequence set forth in SEQ ID NO:1, SEQ ID
NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID
NO:7, SEQ ID NO:8, or SEQ ID NO:9, or a functional derivative
thereof, and a vector or a promoter effective to initiate
transcription in a host cell. The recombinant nucleic acid can
alternatively contain a transcriptional initiation region
functional in a cell, a sequence complementary to an RNA sequence
encoding one or more Yia operon-related polypeptides and a
transcriptional termination region functional in a cell.

In preferred embodiments, the isolated, enriched, purified,
recombinant, or recombinant in a cell, nucleic acid comprises,
consists essentially of, or consists of the full-length nucleic
acid sequence set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3,
SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8,
or SEQ ID NO:9, encodes the full-length amino acid sequence of
SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID
NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, or SEQ ID NO:18,
a functional derivative thereof, or at least 35, 40, 45, 50, 60,
75, 100, 200, or 300 contiguous amino acids of SEQ ID NO:10, SEQ
ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15,
SEQ ID NO:16, SEQ ID NO:17, or SEQ ID NO:18. The Yia
operon-related polypeptides comprise, consist essentially of, or consist
of at least 35, 40, 45, 50, 60, 75, 100, 200, or 300 contiguous
amino acids of SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID
NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, or
SEQ ID NO:18. The nucleic acid may be isolated from a natural
source by cDNA cloning or by subtractive hybridization. The
natural source may be prokaryotic, eukaryotic, or protozoal,
preferably bacterial, from the environment, and the nucleic acid
may be synthesized by the triester method or by using an
automated DNA synthesizer. In other preferred embodiments, the
nucleic acid molecule is isolated, enriched, or purified from a
bacterium, preferably Klebsiella oxytoca.

In yet other preferred embodiments, the nucleic acid is a
conserved or unique region, for example those useful for: the
design of hybridization probes to facilitate identification and
cloning of additional polypeptides, the design of PCR probes to
facilitate cloning of additional polypeptides, obtaining
antibodies to polypeptide regions, and designing antisense
oligonucleotides.

By "conserved nucleic acid regions" are meant regions
present on two or more nucleic acids encoding a Yia operon-
related polypeptide, to which a particular nucleic acid sequence
can hybridize under lower stringency conditions. Examples of
lower stringency conditions are provided in Abe, et al. (J. Biol.
Chem. 19:13361-13368, 1992), hereby incorporated by reference
herein in its entirety, including any drawings, figures, or
tables. Preferably, conserved regions differ by no more than 5
out of 20 nucleotides.
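
The "no more than 5 out of 20 nucleotides" preference can be illustrated with a short sketch. The code below is only an illustration and not part of the disclosed method: it assumes two pre-aligned, ungapped sequences of equal length and reads the criterion as applying to every 20-nucleotide window, which is one possible interpretation. The function names are hypothetical.

```python
def max_window_mismatches(seq_a: str, seq_b: str, window: int = 20) -> int:
    """Largest mismatch count in any `window`-nt stretch of two aligned sequences.

    Assumption: the sequences are already aligned without gaps and have
    the same length, which must be at least `window`.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if len(seq_a) < window:
        raise ValueError("sequences are shorter than the window")
    mismatches = [a != b for a, b in zip(seq_a.upper(), seq_b.upper())]
    current = sum(mismatches[:window])  # mismatches in the first window
    worst = current
    for i in range(window, len(mismatches)):
        # Slide the window one position to the right.
        current += mismatches[i] - mismatches[i - window]
        worst = max(worst, current)
    return worst


def is_conserved(seq_a: str, seq_b: str, window: int = 20,
                 max_mismatches: int = 5) -> bool:
    """True if no window differs at more than `max_mismatches` positions."""
    return max_window_mismatches(seq_a, seq_b, window) <= max_mismatches
```

With the default arguments, two 20-nucleotide regions that differ at five positions pass the check, and regions that differ at six positions do not.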

By "unique nucleic acid region" is meant a sequence present
in a nucleic acid coding for a Yia operon-related polypeptide
that is not present in a sequence coding for any other naturally
occurring polypeptide. Such regions preferably encode 12
(preferably 15, more preferably 20, most preferably 30) or more
contiguous amino acids set forth in the full-length amino acid
sequence of SEQ ID NO:10; 30 (preferably 35, more preferably 40,
most preferably 50) or more contiguous amino acids set forth in
the full-length amino acid sequence of SEQ ID NO:11; 5
(preferably 10, more preferably 15, most preferably 25) or more
contiguous amino acids set forth in the full-length amino acid
sequence of SEQ ID NO:12, SEQ ID NO:13, or SEQ ID NO:14; 17
(preferably 20, more preferably 25, most preferably 35) or more
contiguous amino acids set forth in the full-length amino acid
sequence of SEQ ID NO:15, SEQ ID NO:17, or SEQ ID NO:18; 11
(preferably 15, more preferably 20, most preferably 30) or more
contiguous amino acids set forth in the full-length amino acid
sequence of SEQ ID NO:16. In particular, a unique nucleic acid
region is preferably of bacterial origin.
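
The per-sequence length thresholds recited above are easier to compare when tabulated. The following sketch simply restates them as a lookup table; the tier labels and the function name are illustrative choices and do not appear in the specification.

```python
# Minimum numbers of contiguous amino acids recited for a unique region,
# keyed by SEQ ID NO, in the order:
# (minimum, preferred, more preferred, most preferred).
UNIQUE_REGION_TIERS = {
    10: (12, 15, 20, 30),
    11: (30, 35, 40, 50),
    12: (5, 10, 15, 25),
    13: (5, 10, 15, 25),
    14: (5, 10, 15, 25),
    15: (17, 20, 25, 35),
    16: (11, 15, 20, 30),
    17: (17, 20, 25, 35),
    18: (17, 20, 25, 35),
}

# Illustrative labels only; the text says "preferably", "more preferably", etc.
TIER_NAMES = ("minimum", "preferred", "more preferred", "most preferred")


def preference_tier(seq_id_no: int, n_contiguous: int):
    """Return the highest tier met by `n_contiguous` residues, or None."""
    tier = None
    for name, threshold in zip(TIER_NAMES, UNIQUE_REGION_TIERS[seq_id_no]):
        if n_contiguous >= threshold:
            tier = name
    return tier
```

For example, 45 contiguous residues of SEQ ID NO:11 fall in the "more preferred" tier, while 11 contiguous residues of SEQ ID NO:10 fall below the recited minimum.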

A third aspect of the invention features a nucleic acid
probe for the detection of nucleic acid encoding one or more Yia
operon-related polypeptides, selected from the group consisting
of YiaJ, YiaK, YiaL, ORF1, YiaX2, LyxK, YiaQ, YiaR, and YiaS, in
a sample. Preferably, the nucleic acid probe encodes a
polypeptide that is a fragment of the protein encoded by the full
length amino acid sequence set forth in SEQ ID NO:10, SEQ ID
NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15,
SEQ ID NO:16, SEQ ID NO:17, or SEQ ID NO:18. The nucleic acid
probe contains a nucleotide base sequence that will hybridize to
the full-length sequence set forth in SEQ ID NO:1, SEQ ID NO:2,
SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7,
SEQ ID NO:8, or SEQ ID NO:9, or a functional derivative thereof.
Hybridization is preferably under stringent conditions.

In preferred embodiments, the nucleic acid probe hybridizes
to nucleic acid encoding at least 12, 32, 75, 90, 105, 120, 150,
200, 250, 300 or 350 contiguous amino acids set forth in the
full-length amino acid sequence of SEQ ID NO:10; at least 30, 75,
90, 105, 120, 150, 200, 250, 300 or 350 contiguous amino acids
set forth in the full-length amino acid sequence of SEQ ID NO:11;
at least 5, 12, 32, 75, 90, 105, 120, 150, 200, 250, 300 or 350
contiguous amino acids set forth in the full-length amino acid
sequence of SEQ ID NO:12, SEQ ID NO:13, or SEQ ID NO:14; at least
17, 32, 75, 90, 105, 120, 150, 200, 250, 300 or 350 contiguous
amino acids set forth in the full-length amino acid sequence of
SEQ ID NO:15, SEQ ID NO:17, or SEQ ID NO:18; at least 11, 32, 75,
90, 105, 120, 150, 200, 250, 300 or 350 contiguous amino acids
set forth in the full-length amino acid sequence of SEQ ID NO:16,
or a functional derivative thereof.

Methods for using the probes include detecting the presence
or amount of Yia operon-related RNA in a sample by contacting the
sample with a nucleic acid probe under conditions such that
hybridization occurs and detecting the presence or amount of the
probe bound to Yia operon-related RNA. The nucleic acid duplex
formed between the probe and a nucleic acid sequence coding for
a Yia operon-related polypeptide may be used in the
identification of the sequence of the nucleic acid detected
(Nelson et al., in Non-isotopic DNA Probe Techniques, Academic
Press, San Diego, Kricka, ed., p. 275, 1992, hereby incorporated
by reference herein in its entirety, including any drawings,
figures, or tables). Kits for performing such methods may be
constructed to include a container means having disposed therein
a nucleic acid probe.

A fourth aspect of the invention features a recombinant cell
comprising a nucleic acid molecule encoding one or more Yia
operon-related polypeptides selected from the group consisting of
YiaJ, YiaK, YiaL, ORF1, YiaX2, LyxK, YiaQ, YiaR, and YiaS. In
such cells, the nucleic acid may be under the control of the
genomic regulatory elements, or, preferably, may be under the
control of exogenous regulatory elements including an exogenous
promoter. By "exogenous" is meant a promoter that is not
normally coupled in vivo transcriptionally to the coding sequence
for the Yia operon-related polypeptides.

In preferred embodiments, the recombinant cell comprises
nucleic acid encoding a polypeptide that is a fragment of the
protein encoded by the amino acid sequence set forth in SEQ ID
NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14,
SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, or SEQ ID NO:18. By
"fragment" is meant an amino acid sequence present in a Yia
operon polypeptide. Preferably, such a sequence comprises at
least 12, 32, 75, 90, 105, 120, 150, 200, 250, 300 or 350
contiguous amino acids set forth in the full-length amino acid
sequence of SEQ ID NO:10; at least 30, 75, 90, 105, 120, 150,
200, 250, 300 or 350 contiguous amino acids set forth in the
full-length amino acid sequence of SEQ ID NO:11; at least 5, 12,
32, 75, 90, 105, 120, 150, 200, 250, 300 or 350 contiguous amino
acids set forth in the full-length amino acid sequence of SEQ ID
NO:12, SEQ ID NO:13, or SEQ ID NO:14; at least 17, 32, 75, 90,
105, 120, 150, 200, 250, 300 or 350 contiguous amino acids set
forth in the full-length amino acid sequence of SEQ ID NO:15, SEQ
ID NO:17, or SEQ ID NO:18; at least 11, 32, 75, 90, 105, 120,
150, 200, 250, 300 or 350 contiguous amino acids set forth in the
full-length amino acid sequence of SEQ ID NO:16.

Alternatively, the recombinant cell comprises the nucleic
acid sequence set forth in SEQ ID NO:19, or comprises: (a) one or
more nucleotide sequences that are set forth in SEQ ID NO:1, SEQ
ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ
ID NO:7, SEQ ID NO:8, or SEQ ID NO:9; (b) the complement of the
nucleotide sequence of (a); (c) nucleic acid that hybridizes
under stringent conditions to the nucleotide molecule of (a); (d)
the full length sequence of SEQ ID NO:19, except that it lacks
one or more of the sequences set forth in SEQ ID NO:1, SEQ ID
NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID
NO:7, SEQ ID NO:8, or SEQ ID NO:9; and (e) the complement of
the nucleotide sequence of (d). Preferably, the recombinant cell
further comprises a vector or promoter effective to initiate
transcription of the above-identified nucleic acid in the cell.
Preferably, the vector or promoter comprises the trp-lac hybrid
promoter, the lacO operator, and the lacIq repressor gene.
Preferably, the recombinant cell is a bacterium, more preferably
Klebsiella oxytoca.

Other preferred embodiments of this aspect of the invention
include a recombinant cell useful for screening for one or more
nucleic acid sequences that express one or more products that
convert a source compound into a target compound, where the cell
expresses one or more genes, comprising an inducible promoter,
and where the one or more genes encode one or more proteins that
in the presence of the target compound and an inducer provide a
detectable signal, where the detectable signal indicates the
presence of the one or more nucleic acid sequences. Preferably,
the detectable signal is selected from a group consisting of
growth, fluorescence, luminescence, and color, and most
preferably is growth.

In preferred embodiments of the recombinant cell useful for
screening, the one or more nucleic acid sequences encode a
metabolic pathway not normally present in said cell. In other
preferred embodiments, the nucleic acid is selected from the
group consisting of mutagenized DNA, environmental DNA,
combinatorial libraries, and recombinant DNA. Preferably, the
environmental DNA is obtained from a source selected from the
group consisting of mud, soil, sewage, flood control channels,
sand, and water. Preferably the mutagenized DNA is the result of
enzyme mutagenesis where the mutagenesis is selected from the
group consisting of random, chemical, PCR-based, and directed
mutagenesis. The directed mutagenesis is to include, for
example, DNA shuffling. Preferably the enzymes to be mutagenized
in this way are selected from the group consisting of lactonases,
esterhydrolases, and reductases.

Additionally in this preferred embodiment, the cell
preferably requires the presence of the target compound and the
inducer for growth. Preferably, the target compound is selected
from the group consisting of ascorbate and 2-KLG. In addition,
the one or more genes are preferably under the control of an
inducible promoter, preferably comprising the trp-lac hybrid
promoter, the lacO operator, and the lacIq repressor gene.
Preferably, the one or more proteins encoded by the one or more
genes are one or more Yia operon-related polypeptides.
Preferably, the cell naturally expresses the one or more genes,
or has been genetically manipulated to express the one or more
genes. Preferably, the cell is a bacterium, most preferably
Klebsiella oxytoca.

A fifth aspect of the invention features one or more
isolated, enriched, or purified Yia operon-related polypeptides
selected from the group consisting of YiaJ, YiaK, YiaL, ORF1,
YiaX2, LyxK, YiaQ, YiaR, and YiaS.

By "isolated" in reference to a polypeptide is meant a
polymer of 6 (preferably 12, more preferably 18, most preferably
25, 32, 40, or 50) or more amino acids conjugated to each other,
including polypeptides that are isolated from a natural source or
that are synthesized. In certain aspects longer polypeptides are
preferred, such as those with 100, 200, 300, 400, or more
contiguous amino acids of the sequence set forth in SEQ ID NO:10,
SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID
NO:15, SEQ ID NO:16, SEQ ID NO:17 or SEQ ID NO:18.

The isolated polypeptides of the present invention are
unique in the sense that they are not found in a pure or
separated state in nature. Use of the term "isolated" indicates
that a naturally occurring sequence has been removed from its
normal cellular environment. Thus, the sequence may be in a
cell-free solution or placed in a different cellular environment.
The term does not imply that the sequence is the only amino acid
chain present, but that it is essentially free (about 90-95% pure
at least) of non-amino acid-based material naturally associated
with it.

By the use of the term "enriched" in reference to a
polypeptide is meant that the specific amino acid sequence
constitutes a significantly higher fraction (2-5 fold) of the
total amino acid sequences present in the cells or solution of
interest than in normal or diseased cells or in the cells from
which the sequence was taken. This could be caused by a person
by preferential reduction in the amount of other amino acid
sequences present, or by a preferential increase in the amount of
the specific amino acid sequence of interest, or by a combination
of the two. However, it should be noted that enriched does not
imply that there are no other amino acid sequences present, just
that the relative amount of the sequence of interest has been
significantly increased. The term significant here is used to
indicate that the level of increase is useful to the person
making such an increase, and generally means an increase relative
to other amino acid sequences of at least about 2-fold, more
preferably at least 5- to 10-fold or even more. The term also
does not imply that there is no amino acid sequence from other
sources. The other source of amino acid sequences may, for
example, comprise amino acid sequence encoded by a yeast or
bacterial genome, or a cloning vector such as pUC19. The term is
meant to cover only those situations in which man has intervened
to increase the proportion of the desired amino acid sequence.

It is also advantageous for some purposes that an amino acid
sequence be in purified form. The term "purified" in reference
to a polypeptide does not require absolute purity (such as a
homogeneous preparation); instead, it represents an indication
that the sequence is relatively purer than in the natural
environment. Compared to the natural level this level should be
at least 2-5 fold greater (e.g., in terms of mg/mL).
Purification of at least one order of magnitude, preferably two
or three orders, and more preferably four or five orders of
magnitude is expressly contemplated. The substance is preferably
free of substances present in its natural environment at a
functionally significant level, for example 90%, 95%, or 99%
pure.
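
The fold and order-of-magnitude language used for "enriched" and "purified" reduces to simple arithmetic on two concentration measurements. The sketch below is illustrative only: the function names are hypothetical, both levels are assumed to be expressed in the same units (for example mg/mL), and the default threshold of 2 reflects the lower end of the "2-5 fold" range given above.

```python
import math


def fold_change(natural_level: float, sample_level: float) -> float:
    """Ratio of the level in the preparation to the natural level."""
    if natural_level <= 0 or sample_level < 0:
        raise ValueError("levels must be positive and in the same units")
    return sample_level / natural_level


def orders_of_magnitude(fold: float) -> float:
    """Express a fold change as powers of ten (10-fold is one order)."""
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return math.log10(fold)


def meets_purified_minimum(natural_level: float, sample_level: float,
                           min_fold: float = 2.0) -> bool:
    """True if the preparation is at least `min_fold` above the natural level."""
    return fold_change(natural_level, sample_level) >= min_fold
```

A preparation at 5.0 mg/mL compared with a natural level of 0.5 mg/mL is a 10-fold increase, that is, one order of magnitude.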

In preferred embodiments, the polypeptide is a fragment of
the protein encoded by the full length amino acid sequence set
forth in SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13,
SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, or SEQ ID
NO:18. Preferably, the Yia operon polypeptide contains at least
12, 32, 75, 90, 105, 120, 150, 200, 250, 300 or 350 contiguous
amino acids set forth in the full-length amino acid sequence of
SEQ ID NO:10; at least 30, 75, 90, 105, 120, 150, 200, 250, 300
or 350 contiguous amino acids set forth in the full-length amino
acid sequence of SEQ ID NO:11; at least 5, 12, 32, 75, 90, 105,
120, 150, 200, 250, 300 or 350 contiguous amino acids set forth
in the full-length amino acid sequence of SEQ ID NO:12, SEQ ID
NO:13, or SEQ ID NO:14; at least 17, 32, 75, 90, 105, 120, 150,
200, 250, 300 or 350 contiguous amino acids set forth in the
full-length amino acid sequence of SEQ ID NO:15, SEQ ID NO:17, or
SEQ ID NO:18; at least 11, 32, 75, 90, 105, 120, 150, 200, 250,
300 or 350 contiguous amino acids set forth in the full-length
amino acid sequence of SEQ ID NO:16, or a functional derivative
thereof.

The polypeptide can be isolated from a natural source by
methods well-known in the art. The natural source may be
protozoal, eukaryotic, or prokaryotic, and the polypeptide may be
synthesized using an automated polypeptide synthesizer.
Preferably, the polypeptide is isolated, enriched, or purified
from bacteria, most preferably Klebsiella oxytoca.

In some embodiments the invention includes one or more
recombinant Yia operon-related polypeptides. By "recombinant Yia
operon-related polypeptide" is meant a polypeptide produced by
recombinant DNA techniques such that it is distinct from a
naturally occurring polypeptide either in its location (e.g.,
present in a different cell or tissue than found in nature),
purity or structure. Generally, such a recombinant polypeptide
will be present in a cell in an amount different from that
normally observed in nature.

In a sixth aspect, the invention features an antibody (e.g.,
a monoclonal or polyclonal antibody) having specific binding
affinity to a Yia operon-related polypeptide or a Yia operon-
related polypeptide fragment. In preferred embodiments, the Yia
operon-related polypeptide is selected from the group consisting
of YiaJ, YiaK, YiaL, ORF1, YiaX2, LyxK, YiaQ, YiaR, and YiaS.

By "specific binding affinity" is meant that the antibody
binds to the target Yia operon-related polypeptide with greater
affinity than it binds to other polypeptides under specified
conditions. Antibodies or antibody fragments are polypeptides
which contain regions that can bind other polypeptides. The term
"specific binding affinity" describes an antibody that binds to
a Yia operon polypeptide with greater affinity than it binds to
other polypeptides under specified conditions.

The term "polyclonal" refers to antibodies that are
heterogeneous populations of antibody molecules derived from the
sera of animals immunized with an antigen or an antigenic
functional derivative thereof. For the production of polyclonal
antibodies, various host animals may be immunized by injection
with the antigen. Various adjuvants may be used to increase the
immunological response, depending on the host species.

"Monoclonal antibodies" are substantially homogeneous
populations of antibodies to a particular antigen. They may be
obtained by any technique which provides for the production of
antibody molecules by continuous cell lines in culture.
Monoclonal antibodies may be obtained by methods known to those
skilled in the art (Kohler et al., Nature 256:495-497, 1975, and
U.S. Patent No. 4,376,110, both of which are hereby incorporated
by reference herein in their entirety including any figures,
tables, or drawings).

The term "antibody fragment" refers to a portion of an
antibody, often the hypervariable region and portions of the
surrounding heavy and light chains, that displays specific
binding affinity for a particular molecule. A hypervariable
region is a portion of an antibody that physically binds to the
polypeptide target.

Antibodies or antibody fragments having specific binding
affinity to a Yia operon-related polypeptide of the invention may
be used in methods for detecting the presence and/or amount of
Yia operon polypeptide in a sample by probing the sample with the
antibody under conditions suitable for Yia operon-related
polypeptide-antibody immunocomplex formation and detecting the
presence and/or amount of the antibody conjugated to the Yia
operon-related polypeptide. Diagnostic kits for performing such
methods may be constructed to include antibodies or antibody
fragments specific for the Yia operon-related polypeptide as well
as a conjugate of a binding partner of the antibodies or the
antibodies themselves.

An antibody or antibody fragment with specific binding
affinity to a Yia operon-related polypeptide of the invention can
be isolated, enriched, or purified from a prokaryotic or
eukaryotic organism. Routine methods known to those skilled in
the art enable production of antibodies or antibody fragments, in
both prokaryotic and eukaryotic organisms. Purification,
enrichment, and isolation of antibodies, which are polypeptide
molecules, are described above.

Antibodies having specific binding affinity to a Yia operon-
related polypeptide of the invention may be used in methods for
detecting the presence and/or amount of Yia operon-related
polypeptide in a sample by contacting the sample with the
antibody under conditions such that an immunocomplex forms and
detecting the presence and/or amount of the antibody conjugated
to the Yia operon-related polypeptide. Diagnostic kits for
performing such methods may be constructed to include a first
container containing the antibody and a second container having
a conjugate of a binding partner of the antibody and a label,
such as, for example, a radioisotope. The diagnostic kit may
also include notification of an FDA approved use and instructions
therefor.

In a seventh aspect, the invention features a hybridoma that
produces an antibody having specific binding affinity to a Yia
operon-related polypeptide or a Yia operon-related polypeptide
fragment. By "hybridoma" is meant an immortalized cell line that
is capable of secreting an antibody, for example an antibody to
a Yia operon-related polypeptide of the invention. In preferred
embodiments, the antibody to the Yia operon-related polypeptide
comprises a sequence of amino acids that is able to specifically
bind a Yia operon-related polypeptide of the invention.

In an eighth aspect, the invention features a Yia operon-
related polypeptide binding agent able to bind to a Yia operon-
related polypeptide. The binding agent is preferably a purified
antibody that recognizes an epitope present on a Yia operon-
related polypeptide of the invention. Other binding agents
include molecules that bind to Yia operon-related polypeptides
and analogous molecules which bind to a Yia operon-related
polypeptide. Such binding agents may be identified by using
assays that measure Yia operon-related binding partner activity,
such as those that measure growth or ascorbate metabolism.

The invention also features a method for screening for other
organisms containing a Yia operon-related polypeptide of the
invention or an equivalent sequence. The method involves
identifying the novel polypeptide in other organisms using
techniques that are routine and standard in the art, such as
those described herein for identifying the Yia operon-related
polypeptide of the invention or others standard in the art (e.g.,
cloning, Southern or Northern blot analysis, in situ
hybridization, PCR amplification, etc.).

A ninth aspect of the invention features a method for
identifying a substance that converts a source compound to a
target compound, comprising: contacting a cell with nucleic
acid, where the nucleic acid expresses a product that converts a
source compound into a target compound, and where the cell
expresses one or more proteins which in the presence of the
target compound provide a detectable signal; contacting the cell
with a test substance; and monitoring the detectable signal,
where the detectable signal indicates the presence of the
substance.

In preferred embodiments of the method for identifying a
substance that converts a source compound to a target compound,
the substance is selected from the group consisting of
antibodies, small organic molecules, peptidomimetics, and natural
products. In other preferred embodiments, the detectable signal
is selected from a group consisting of growth, fluorescence,
luminescence, and color. Preferably, the detectable signal is
growth, and the target compound is metabolizable to an element
selected from the group consisting of carbon, nitrogen, sulfur,
and phosphorus, most preferably carbon. Alternatively, the
target compound is metabolizable to an essential nutrient. In
still other preferred embodiments of the invention, the source
compound is selected from the group consisting of 2-KLG, 2,5-DKG,
L-IA, L-GuA, and glucose.

In other highly preferred embodiments of the method for
identifying a substance that converts a source compound to a
target compound, the one or more proteins are one or more Yia
operon-related polypeptides. Preferably, the Yia operon further
comprises a vector or promoter effective to initiate
transcription in a host cell, and most preferably the vector or
promoter comprises the trp-lac hybrid promoter, the lacO
operator, and the lacIq repressor gene.

A tenth aspect of the invention features a method for
detecting the presence, absence, or amount of a compound in a
sample comprising: contacting the sample with a cell, where the
cell expresses one or more genes encoding one or more proteins
that in the presence of the compound provide a detectable signal
that indicates the presence, absence, or amount of said compound.
A schematic of an example of a preferred embodiment of the method
is shown in Fig. 13. In preferred embodiments, the compound is
ascorbate and the detectable signal is selected from a group
consisting of growth, fluorescence, luminescence, and color. In
other preferred embodiments, the one or more genes comprise
yiaJ, and preferably further comprise a promoter
transcriptionally linked to a reporter gene. Preferably, YiaJ is
naturally expressed in the cell, or the cell has been genetically
manipulated to express YiaJ. Preferably the reporter gene has a
promoter transcriptionally linked and the expression of the
reporter gene is regulated by the binding of YiaJ to the
promoter. The binding of YiaJ to the promoter is preferably
regulated by the presence or absence of ascorbate. Preferably
the cell is a bacterium, and most preferably Klebsiella oxytoca.

An eleventh aspect of the invention features an isolated,
purified, or enriched nucleic acid molecule encoding YiaJ and a
reporter gene. Preferably, the nucleic acid molecule further
comprises a promoter transcriptionally linked to a reporter gene.
Preferably the reporter gene is regulated by the binding of YiaJ
to the promoter. The binding of YiaJ to the promoter is
preferably regulated by the presence or absence of ascorbate. In
preferred embodiments, the nucleic acid molecule further
comprises a vector or promoter effective to initiate
transcription in a host cell.

A twelfth aspect of the invention features a recombinant
cell comprising the nucleic acid molecule described in the
eleventh aspect of the invention, above.

Preferred embodiments of this aspect of the invention
feature a recombinant cell for detecting the presence, absence,
or amount of a compound in a sample, where the cell expresses one
or more genes encoding one or more proteins that in the presence
of the compound provide a detectable signal, where the signal
indicates the presence, absence, or amount of the compound. In
preferred embodiments, the detectable signal is selected from a
group consisting of growth, fluorescence, luminescence, and
color.

In other preferred embodiments of the recombinant cell for
detecting the presence, absence, or amount of a compound in a
sample, the one or more genes comprise yiaJ, and further
comprise a promoter transcriptionally linked to a reporter gene.
Preferably, the expression of the reporter gene is regulated by
the binding of YiaJ to the promoter. Preferably, yiaJ is
naturally expressed in the recombinant cell, or the cell has been
genetically manipulated to express yiaJ. The recombinant cell is
preferably a bacterium, and more preferably Klebsiella oxytoca.

A thirteenth aspect of the invention features a method of
selection for one or more nucleic acid sequences encoding a
metabolic pathway from a source compound to a target compound
comprising: (1) identifying an organism that metabolizes a target
compound to provide an essential element; (2) identifying one or
more genes responsible for the metabolism of the target compound
to the essential element; (3) expressing the one or more genes
under the control of an inducible promoter, whereby the target
compound is metabolized only in the presence of an inducer and
not in the absence of the inducer; (4) expressing nucleic acid
sequences potentially encoding the metabolic pathway in the
recipient organism; and (5) selecting the recipient organism for
growth in the presence of the source compound in the absence of
the target compound and in the presence of the inducer, where
growth on the source compound in the absence of the target
compound and in the presence of the inducer indicates the
presence of the nucleic acid sequence.

In preferred embodiments of the method of selection, the
essential element is selected from the group consisting of
carbon, phosphorus, nitrogen, and sulfur, and most preferably is
carbon.

In other preferred embodiments, the method of selection 
further comprises the transfer of the one or more genes to a 
highly genetically manipulatable recipient organism, such that 
the recipient organism metabolizes the target compound to provide 
an essential element. 

By a "highly genetically manipulatable recipient organism"
is meant an organism, preferably single-celled, more preferably
bacteria, and most preferably Klebsiella oxytoca, that can be
manipulated by the standard genetic techniques, including but not
limited to, transfection, selection in selective media, and
growth in culture.

The summary of the invention described above is not limiting 
and other features and advantages of the invention will be 
apparent from the following detailed description of the 
invention, and from the claims. 

Description Of The Figures

Figure 1 shows a physical map of the yiaK-S operon, which
includes the open reading frames yiaK, yiaL, orf1, yiaX2, lyxK,
yiaQ, yiaR, and yiaS, and its putative regulator, yiaJ, compared
with the E. coli yiaK-S operon, which includes the open reading
frames yiaK, yiaL, yiaM, yiaN, yiaO, lyxK, yiaQ, yiaR, and yiaS,
and its putative regulator yiaJ.

Figures 2A, 2B, 2C, 2D, 2E, and 2F show the nucleic acid
sequence (SEQ ID NO:19) and translated amino acid sequences of
the open reading frames of the yia operon and its putative
regulator, yiaJ.

Figure 3 shows a multiple sequence alignment of YiaJ-Ko (SEQ
ID NO:10), YiaJ-Ec (SEQ ID NO:20), and YiaJ-Hi (SEQ ID NO:21).
Identical sequences among the three proteins are indicated by
shading.

Figure 4 shows a multiple sequence alignment of YiaK-Ko (SEQ
ID NO:11), YiaK-Ec (SEQ ID NO:22), and YiaK-Hi (SEQ ID NO:23).
Identical sequences among the three proteins are indicated by
shading.

Figure 5 shows a multiple sequence alignment of YiaL-Ko (SEQ
ID NO:12), YiaL-Ec (SEQ ID NO:24), and YhcH-Hi (SEQ ID NO:25).
Identical sequences among the three proteins are indicated by
shading.

Figure 6 shows a multiple sequence alignment of LyxK-Ko (SEQ
ID NO:15), LyxK-Ec (SEQ ID NO:26), and LyxK-Hi (SEQ ID NO:27).
Identical sequences among the three proteins are indicated by
shading.

Figure 7 shows a multiple sequence alignment of YiaQ-Ko (SEQ
ID NO:16), YiaQ-Ec (SEQ ID NO:28), and YiaQ-Hi (SEQ ID NO:29).
Identical sequences among the three proteins are indicated by
shading.

Figure 8 shows a multiple sequence alignment of YiaR-Ko (SEQ
ID NO:17), YiaR-Ec (SEQ ID NO:30), and YiaR-Hi (SEQ ID NO:31).
Identical sequences among the three proteins are indicated by
shading.

Figure 9 shows a multiple sequence alignment of YiaS-Ko (SEQ
ID NO:18), YiaS-Ec (SEQ ID NO:32), and YiaS-Hi (SEQ ID NO:33).
Identical sequences among the three proteins are indicated by
shading.

Figure 10 shows a schematic of the construction of the
Tester Strain. The plasmid pMG125 is shown, which comprises: (i)
a chloramphenicol resistance marker (cat); (ii) the thermosensitive
origin of replication from plasmid pHO1 (pHO1 rep(ts));
(iii) a 0.8 kb fragment containing the 5' region of the yiaJ gene
and its promoter sequences; (iv) the spectinomycin resistance
marker (spc); (v) the lacIq-lacO-trc promoter fragment; and (vi)
a 1 kb fragment containing the 5' end of yiaK, including its
ribosome binding site for translation initiation while excluding
the promoter sequences of the yiaK-S operon. The recombinant
plasmid pMG125 was introduced into K. oxytoca wild type strain
VJSK009 by transformation at 30 °C, the permissive temperature
for pMAK705 replication. Chromosomal integration of the pMG125
insert into VJSK009 was achieved by double crossover at the yiaJ-K
locus such that the endogenous promoter of the yiaK-S operon
was replaced with the inducible lacIq-trc promoter system in the
resulting recombinant cell, MGK003.

Figure 11 shows a schematic representation of a general
example of a metabolic selection process. Briefly, genetic
material, isolated from microbes, is incorporated into a Tester
Strain and the gene(s) of interest selected for by growth on "S".
The gene(s) of interest will catalyze the conversion of "S" to
"T" in the Tester Strain, thereby allowing growth on "S".

Figure 12 shows a schematic representation of a more
specific example of a metabolic selection process, in which "S" is
2-KLG and "T" is AsA. In this case, the gene(s) of interest are
those that catalyze the conversion of 2-KLG to AsA.

Figure 13, part A shows a theoretical model for AsA-
dependent activation of the yiaK-S operon. Based on
transcriptional analyses, the YiaJ regulatory protein is thought
to activate transcription of the yiaK-S AsA catabolic operon in
response to AsA present in the medium. However, the inventors do
not wish to be held to this interpretation of the data.

Figure 13, part B shows a schematic representation of a
whole-cell reporter system for AsA sensing. The yiaK-S promoter
region (Pyia) is fused to the Green-Fluorescent-Protein (GFP) gene
(or to lux or other reporter genes), and the fusion is integrated
into the chromosome of an indicator strain, which also contains
the YiaJ regulator. In the presence of AsA, YiaJ is stimulated
and activates transcription of the yia-GFP fusion, thereby
conferring an easily detectable GFP-positive or fluorescent
phenotype.

Detailed Description Of The Invention 

The instant invention is based in part on the use of a
metabolic selection strategy that uses a recombinant DNA
selection procedure to identify enzymatic pathways for the
conversion of a source compound to a target compound. This
technique allows at least a million-fold increase in the
discovery rate over classical biochemical screening approaches,
and allows testing of the 99% of the environmental microbes that
are currently not able to be cultured in the laboratory.

The general process involves the creation/identification of
an easily genetically manipulatable organism containing an
inducible signal, such that the signal is activated when a target
compound is metabolized, followed by the screening of nucleic
acid in this organism to identify genes which metabolize a source
compound to the target compound (Figs. 11 and 12).

In a specific embodiment, the process involves three steps:
(1) the identification of an organism capable of metabolizing the
target compound to carbon and energy, and the transfer of this
metabolic pathway to a highly genetically manipulatable organism,
e.g., Escherichia coli or Bacillus subtilis, with the result that
the recipient now uses the target compound for growth; (2)
placing the expression of the pathway under the control of an
inducible promoter, whereby the target compound is metabolized in
the presence of an inducer and not in its absence; and (3)
cloning genes, which are to be tested for their ability to
metabolize the source compound, into the recipient, and selecting
for growth on the source compound in the presence of the inducer
but in the absence of the target compound.

Once positive organisms are identified in the above
selection scheme by growth in the presence of inducer, the
organisms are further screened for their ability to grow in the
absence of the inducer. No growth in the absence of the inducer
indicates that the metabolism of the source compound proceeds via
the target compound. Thus, the nucleic acid probably encodes an
enzymatic pathway for the conversion of the source compound to
the target compound.
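
The two-stage interpretation of growth results can be summarized as a
small decision function. The following is a minimal sketch for
illustration only; the class name, function name, and outcome labels
are hypothetical and do not appear in the specification. The case of
growth without inducer is discussed in the paragraph that follows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GrowthResult:
    """Growth of one clone with the source compound as the selective substrate."""
    with_inducer: bool     # pathway for the target compound is switched on
    without_inducer: bool  # pathway for the target compound is switched off


def interpret(result: GrowthResult) -> str:
    """Map the two growth observations onto the outcomes described in the text.

    The returned labels are hypothetical shorthand, not terms from the source.
    """
    if not result.with_inducer:
        # The clone fails the primary selection (growth on source plus inducer).
        return "not selected"
    if result.without_inducer:
        # Growth does not depend on the inducible target-compound pathway.
        return "source metabolized without requiring the target"
    # Growth depends on the inducer, so metabolism probably proceeds via the target.
    return "source probably converted to target"


if __name__ == "__main__":
    print(interpret(GrowthResult(with_inducer=True, without_inducer=False)))
```

A clone that grows only when the inducer is present is the candidate
of interest; the other positive class calls for the follow-up work
described next.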

Growth in the absence of the inducer indicates that
metabolism of the source compound to the essential element or
factor does not require prior conversion to the target compound;
rather, it may proceed directly, or through an intermediate, to
the essential element or factor. When conversion directly to the
target compound is the desired result, further work is necessary
to obtain the desired genes. Methods of obtaining the desired
genes include: re-selection of DNA from other sources; random
mutation of the DNA followed by re-selection; knocking out
(deleting or blocking the expression of genes by methods well
known in the art) the genes that allow the direct conversion to
the essential element or factor, or from an intermediate to the
essential element or factor, followed by re-selection; etc. In
one preferred embodiment, the genes that allow the direct, or
partially direct, conversion to the essential factor are knocked
out or their expression is blocked, thereby "forcing" the
conversion to the essential element through the target compound.
This will be effective if, for example, a pathway through the
target compound exists but is thermodynamically unfavorable.

Alternatively, if the intermediate is freely
interconvertible with the desired target compound as well as
convertible to the essential element, growth in the absence of
the inducer may be an acceptable outcome, or even desirable. By
"freely interconvertible" is meant that an enzymatic pathway is
present to allow the intermediate to be converted to the target.
The interconvertibility of the compounds would also be determined
using the methods described above for obtaining a pathway
directly to the target compound.

Under some circumstances, selection of a pathway directly,
or through an intermediate, to the essential element or factor
rather than to the target compound, is a preferred result. For
example, under circumstances where the desired target compound is
not one that can be used for direct selection (e.g., does not
cross membranes or is rapidly broken down) a "surrogate target"
might have to be used. A surrogate target refers to one that is
used for selection, but is not the most highly desired target. In
this embodiment, the target would preferably be on the pathway of
conversion of the surrogate target to the essential element.

I. Functional Derivatives 

Provided herein are functional derivatives of a polypeptide
or nucleic acid of the invention. By "functional derivative" is
meant a "chemical derivative," "fragment," or "variant," of the
polypeptide or nucleic acid of the invention, which terms are
defined below. A functional derivative retains at least a
portion of the function of the protein, for example reactivity
with an antibody specific for the protein, enzymatic activity or
binding activity mediated through noncatalytic domains, which
permits its utility in accordance with the present invention. It
is well known in the art that due to the degeneracy of the
genetic code numerous different nucleic acid sequences can code
for the same amino acid sequence. Equally, it is also well
known in the art that conservative changes in amino acids can be
made to arrive at a protein or polypeptide which retains the
functionality of the original. In both cases, all permutations
are intended to be covered by this disclosure.
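
To give a sense of scale for the degeneracy mentioned above, the
number of distinct nucleic acid sequences that encode a given peptide
is the product of the codon counts of its residues. The sketch below
assumes the standard genetic code; the function name is illustrative
only.

```python
from math import prod

# Number of sense codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the counts sum to 61).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_coding_sequences(peptide: str) -> int:
    """Return how many distinct DNA sequences encode the given peptide."""
    return prod(CODON_COUNTS[residue] for residue in peptide.upper())


if __name__ == "__main__":
    # Met (1 codon) x Lys (2 codons) x Leu (6 codons) = 12 coding sequences.
    print(count_coding_sequences("MKL"))
```

Even a short peptide therefore corresponds to many nucleic acid
sequences, and the count grows multiplicatively with length.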

Also included within the "functional derivatives" of the
invention, in particular of the polypeptides, are "chemical
derivatives". A "chemical derivative" contains additional
chemical moieties not normally a part of the protein. Covalent
modifications of the protein or peptides are included within the
scope of this invention. Such modifications may be introduced
into the molecule by reacting targeted amino acid residues of the
peptide with an organic derivatizing agent that is capable of
reacting with selected side chains or terminal residues, for
example, as described below.

Cysteinyl residues most commonly are reacted with alpha-
haloacetates (and corresponding amines), such as chloroacetic
acid or chloroacetamide, to give carboxymethyl or
carboxyamidomethyl derivatives. Cysteinyl residues also are derivatized
by reaction with bromotrifluoroacetone, chloroacetyl phosphate,
N-alkylmaleimides, 3-nitro-2-pyridyl disulfide, methyl 2-pyridyl
disulfide, p-chloromercuribenzoate, 2-chloromercuri-4-nitrophenol,
or chloro-7-nitrobenzo-2-oxa-1,3-diazole.

Histidyl residues are derivatized by reaction with
diethylpyrocarbonate at pH 5.5-7.0 because this agent is
relatively specific for the histidyl side chain. Para-
bromophenacyl bromide also is useful; the reaction is preferably
performed in 0.1 M sodium cacodylate at pH 6.0.

Lysinyl and amino terminal residues are reacted with
succinic or other carboxylic acid anhydrides. Derivatization
with these agents has the effect of reversing the charge of the
lysinyl residues. Other suitable reagents for derivatizing
primary amine containing residues include imidoesters such as
methyl picolinimidate; pyridoxal phosphate; pyridoxal;
chloroborohydride; trinitrobenzenesulfonic acid; O-methylisourea;
2,4-pentanedione; and transaminase-catalyzed reaction with
glyoxylate.

Arginyl residues are modified by reaction with one or
several conventional reagents, among them phenylglyoxal, 2,3-
butanedione, 1,2-cyclohexanedione, and ninhydrin. Derivatization
of arginine residues requires that the reaction be performed in
alkaline conditions because of the high pKa of the guanidine
functional group. Furthermore, these reagents may react with the
groups of lysine as well as the arginine alpha-amino group.

Tyrosyl residues are well-known targets of modification for
introduction of spectral labels by reaction with aromatic
diazonium compounds or tetranitromethane. Most commonly,
N-acetylimidazole and tetranitromethane are used to form O-acetyl
tyrosyl species and 3-nitro derivatives, respectively.

Carboxyl side groups (aspartyl or glutamyl) are selectively
modified by reaction with carbodiimides (R'-N=C=N-R') such as
1-cyclohexyl-3-(2-morpholinyl-(4-ethyl)) carbodiimide or
1-ethyl-3-(4-azonia-4,4-dimethylpentyl) carbodiimide. Furthermore,
aspartyl and glutamyl residues are converted to asparaginyl and
glutaminyl residues by reaction with ammonium ions.

Glutaminyl and asparaginyl residues are frequently
deamidated to the corresponding glutamyl and aspartyl residues.
Alternatively, these residues are deamidated under mildly acidic
conditions. Either form of these residues falls within the scope
of this invention.

Derivatization with bifunctional agents is useful, for
example, for cross-linking the component peptides of the protein
to each other or to other proteins in a complex to a water-
insoluble support matrix or to other macromolecular carriers.
Commonly used cross-linking agents include, for example, 1,1-
bis(diazoacetyl)-2-phenylethane, glutaraldehyde,
N-hydroxysuccinimide esters, for example, esters with 4-azidosalicylic
acid, homobifunctional imidoesters, including disuccinimidyl
esters such as 3,3'-dithiobis(succinimidylpropionate), and
bifunctional maleimides such as bis-N-maleimido-1,8-octane.

Derivatizing agents such as
methyl-3-[(p-azidophenyl)dithio]propioimidate yield
photo-activatable intermediates that are
capable of forming crosslinks in the presence of light.
Alternatively, reactive water-insoluble matrices such as cyanogen
bromide-activated carbohydrates and the reactive substrates
described in U.S. Patent Nos. 3,969,287; 3,691,016; 4,195,128;
4,247,642; 4,229,537; and 4,330,440 are employed for protein
immobilization.

Other modifications include hydroxylation of proline and
lysine, phosphorylation of hydroxyl groups of seryl or threonyl
residues, methylation of the alpha-amino groups of lysine,
arginine, and histidine side chains (Creighton, T.E., Proteins:
Structure and Molecular Properties, W.H. Freeman & Co., San
Francisco, pp. 79-86 (1983)), acetylation of the N-terminal
amine, and, in some instances, amidation of the C-terminal
carboxyl groups.

Such derivatized moieties may improve the stability,
solubility, absorption, biological half-life, and the like. The
moieties may alternatively eliminate or attenuate any undesirable
side effect of the protein complex and the like. Moieties
capable of mediating such effects are disclosed, for example, in
Remington's Pharmaceutical Sciences, 18th ed., Mack Publishing
Co., Easton, PA (1990).

The term "fragment" is used to indicate a polypeptide
derived from the amino acid sequence of the proteins of the
complexes, having a length less than the full-length polypeptide
from which it has been derived. Such a fragment may, for
example, be produced by proteolytic cleavage of the full-length
protein. Preferably, the fragment is obtained recombinantly by
appropriately modifying the DNA sequence encoding the proteins to
delete one or more amino acids at one or more sites of the C-
terminus, N-terminus, and/or within the native sequence.
Fragments of a protein are useful for screening for compounds
that act to modulate enzyme activity, as described herein. It is
understood that such fragments may retain one or more
characterizing portions of the native complex. Examples of such
retained characteristics include: catalytic activity; substrate
specificity; interaction with other molecules in the intact cell;
regulatory functions; or binding with an antibody specific for
the native complex, or an epitope thereof.

Another functional derivative intended to be within the
scope of the present invention is a "variant" polypeptide which
either lacks one or more amino acids or contains additional or
substituted amino acids relative to the native polypeptide. The
variant may be derived from a naturally occurring complex
component by appropriately modifying the protein DNA coding
sequence to add, remove, and/or to modify codons for one or more
amino acids at one or more sites of the C-terminus, N-terminus,
and/or within the native sequence. It is understood that such
variants having added, substituted and/or additional amino acids
retain one or more characterizing portions of the native protein,
as described above.

A functional derivative of a protein with deleted, inserted
and/or substituted amino acid residues may be prepared using
standard techniques well-known to those of ordinary skill in the
art. For example, the modified components of the functional
derivatives may be produced using site-directed mutagenesis
techniques (as exemplified by Adelman et al., 1983, DNA 2:183)
wherein nucleotides in the DNA coding the sequence are modified,
and thereafter expressing this recombinant DNA in a prokaryotic
or eukaryotic host cell, using techniques such as those described
above. Alternatively, proteins with amino acid deletions,
insertions and/or substitutions may be conveniently prepared by
direct chemical synthesis, using methods well-known in the art.
The functional derivatives of the proteins typically exhibit the
same qualitative biological activity as the native proteins.
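
The effect of modifying a codon in the coding DNA can be previewed in
silico before any laboratory work. The following is a minimal sketch,
assuming the standard genetic code; the function names and the example
sequence are hypothetical and are not taken from the sequences of the
invention.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop codon)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def substitute_codon(dna: str, residue_index: int, new_codon: str) -> str:
    """Replace the codon for the residue at residue_index (0-based)."""
    start = 3 * residue_index
    if len(new_codon) != 3 or residue_index < 0 or start + 3 > len(dna):
        raise ValueError("codon must have 3 bases and lie within the sequence")
    return dna[:start] + new_codon.upper() + dna[start + 3:]


if __name__ == "__main__":
    native = "ATGAAACTGTAA"                       # encodes M-K-L-stop
    variant = substitute_codon(native, 1, "GAA")  # Lys codon AAA -> Glu codon GAA
    print(translate(native), translate(variant))
```

A substitution that changes the translated residue gives a variant in
the sense defined above, whereas a synonymous codon change leaves the
polypeptide unchanged.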



II. Nucleic Acid Probes, Methods, and Kits for Detection of Yia 
operon-related polypeptides 

A nucleic acid probe of the present invention may be used to
probe an appropriate chromosomal or cDNA library by usual
hybridization methods to obtain other nucleic acid molecules of
the present invention. A chromosomal DNA or cDNA library may be
prepared from appropriate cells according to recognized methods
in the art (cf. "Molecular Cloning: A Laboratory Manual", second
edition, Cold Spring Harbor Laboratory, Sambrook, Fritsch, &
Maniatis, eds., 1989).

In the alternative, chemical synthesis can be carried out in
order to obtain nucleic acid probes having nucleotide sequences
which correspond to N-terminal and C-terminal portions of the
amino acid sequence of the polypeptide of interest. The
synthesized nucleic acid probes may be used as primers in a
polymerase chain reaction (PCR) carried out in accordance with
recognized PCR techniques, essentially according to PCR
Protocols, "A Guide to Methods and Applications", Academic Press,
Michael, et al., eds., 1990, utilizing the appropriate
chromosomal or cDNA library to obtain the fragment of the present
invention.
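
One way to picture primers that "correspond to" terminal portions of
an amino acid sequence is back-translation into degenerate
oligonucleotides written with IUPAC ambiguity codes. The sketch below
is illustrative only: it assumes the standard genetic code, the
function names are hypothetical, and for amino acids with six codons
(Leu, Ser, Arg) the per-position code covers more codons than actually
encode the residue.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_FOR: dict[str, list[str]] = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G",
    frozenset("T"): "T", frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W", frozenset("GT"): "K",
    frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def degenerate_codon(aa: str) -> str:
    """Collapse all codons for one amino acid into a single IUPAC triplet."""
    codons = CODONS_FOR[aa]
    return "".join(IUPAC[frozenset(c[i] for c in codons)] for i in range(3))


def forward_primer(n_terminal_peptide: str) -> str:
    """Sense-strand degenerate primer for an N-terminal peptide."""
    return "".join(degenerate_codon(aa) for aa in n_terminal_peptide.upper())


def reverse_primer(c_terminal_peptide: str) -> str:
    """Antisense degenerate primer (reverse complement) for a C-terminal peptide."""
    return forward_primer(c_terminal_peptide).translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(forward_primer("MKW"), reverse_primer("MKW"))
```

In practice, peptides rich in Met and Trp are favored for primer
design because they contribute no degeneracy.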

One skilled in the art can readily design such probes based
on the sequence disclosed herein using methods of computer
alignment and sequence analysis known in the art ("Molecular
Cloning: A Laboratory Manual", 1989, supra). The hybridization
probes of the present invention can be labeled by standard
labeling techniques such as with a radiolabel, enzyme label,
fluorescent label, biotin-avidin label, chemiluminescence, and
the like. After hybridization, the probes may be visualized
using known methods.

The nucleic acid probes of the present invention include
RNA, as well as DNA probes, such probes being generated using
techniques known in the art. The nucleic acid probe may be
immobilized on a solid support. Examples of such solid supports
include, but are not limited to, plastics such as polycarbonate,
complex carbohydrates such as agarose and sepharose, and acrylic
resins, such as polyacrylamide and latex beads. Techniques for
coupling nucleic acid probes to such solid supports are well
known in the art.

The test samples suitable for nucleic acid probing methods
of the present invention include, for example, cells or nucleic
acid extracts of cells, or biological fluids. The samples used
in the above-described methods will vary based on the assay
format, the detection method and the nature of the tissues, cells
or extracts to be assayed. Methods for preparing nucleic acid
extracts of cells are well known in the art and can be readily
adapted in order to obtain a sample which is compatible with the
method utilized.

One method of detecting the presence of nucleic acids of the
invention in a sample comprises (a) contacting said sample with
the above-described nucleic acid probe under conditions such that
hybridization occurs, and (b) detecting the presence of said
probe bound to said nucleic acid molecule. One skilled in the
art would select the nucleic acid probe according to techniques
known in the art as described above. Samples to be tested
include but should not be limited to RNA samples extracted from
environmental samples.
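
As a purely computational analogy to steps (a) and (b), the positions
at which a probe could anneal to a single-stranded sample sequence can
be located by searching for the reverse complement of the probe. This
sketch models only perfect complementarity over the DNA alphabet; real
hybridization tolerates mismatches to a degree that depends on the
stringency of the conditions. The function names and sequences are
hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def binding_sites(sample_strand: str, probe: str) -> list[int]:
    """Return 0-based start positions where the probe is fully complementary."""
    target = reverse_complement(probe)
    strand = sample_strand.upper()
    sites = []
    start = strand.find(target)
    while start != -1:
        sites.append(start)
        start = strand.find(target, start + 1)  # allow overlapping matches
    return sites


if __name__ == "__main__":
    # The probe CAGTTTCAT is complementary to ATGAAACTG in the sample strand.
    print(binding_sites("GGATGAAACTGCC", "CAGTTTCAT"))
```

An empty result corresponds to no detectable probe binding under this
simplified model.
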

A kit for detecting the presence of nucleic acids of the
invention in a sample comprises at least one container means
having disposed therein the above-described nucleic acid probe.
The kit may further comprise other containers comprising one or
more of the following: wash reagents and reagents capable of
detecting the presence of bound nucleic acid probe. Examples of
detection reagents include, but are not limited to radiolabelled
probes, enzymatic labeled probes (horseradish peroxidase,
alkaline phosphatase), and affinity labeled probes (biotin,
avidin, or streptavidin). Preferably, the kit further comprises
instructions for use.

In detail, a compartmentalized kit includes any kit in which
reagents are contained in separate containers. Such containers
include small glass containers, plastic containers or strips of
plastic or paper. Such containers allow the efficient transfer
of reagents from one compartment to another compartment such that
the samples and reagents are not cross-contaminated and the
agents or solutions of each container can be added in a
quantitative fashion from one compartment to another. Such
containers will include a container which will accept the test
sample, a container which contains the probe or primers used in
the assay, containers which contain wash reagents (such as
phosphate buffered saline, Tris-buffers, and the like), and
containers which contain the reagents used to detect the
hybridized probe, bound antibody, amplified product, or the like.
One skilled in the art will readily recognize that the nucleic
acid probes described in the present invention can readily be
incorporated into one of the established kit formats which are
well known in the art.

III. DNA Constructs Comprising Yia Operon-Related Nucleic Acid
Molecules and Cells Containing These Constructs.

The present invention also relates to a recombinant DNA
molecule comprising, 5' to 3', a promoter effective to initiate
transcription in a host cell and the above-described nucleic acid
molecules. In addition, the present invention relates to a
recombinant DNA molecule comprising a vector and an above-
described nucleic acid molecule. The present invention also
relates to a nucleic acid molecule comprising a transcriptional
region functional in a cell, a sequence complementary to an RNA
sequence encoding an amino acid sequence corresponding to the
above-described polypeptide, and a transcriptional termination
region functional in said cell. The above-described molecules
may be isolated and/or purified DNA molecules.

The present invention also relates to a cell or organism
that contains an above-described nucleic acid molecule and
thereby is capable of expressing a polypeptide. The polypeptide
may be purified from cells which have been altered to express the
polypeptide. A cell is said to be "altered to express a desired
polypeptide" when the cell, through genetic manipulation, is made
to produce a protein which it normally does not produce or which
the cell normally produces at lower levels. One skilled in the
art can readily adapt procedures for introducing and expressing
either genomic, cDNA, or synthetic sequences into either
eukaryotic or prokaryotic cells.

A nucleic acid molecule, such as DNA, is said to be "capable
of expressing" a polypeptide if it contains nucleotide sequences
which contain transcriptional and translational regulatory
information and such sequences are "operably linked" to
nucleotide sequences which encode the polypeptide. An operable
linkage is a linkage in which the regulatory DNA sequences and
the DNA sequence sought to be expressed are connected in such a
way as to permit gene sequence expression. The precise nature of
the regulatory regions needed for gene sequence expression may
vary from organism to organism, but shall in general include a
promoter region which, in prokaryotes, contains both the promoter
(which directs the initiation of RNA transcription) as well as
the DNA sequences which, when transcribed into RNA, will signal
synthesis initiation. Such regions will normally include those
5'-non-coding sequences involved with initiation of transcription
and translation, such as the TATA box, capping sequence, CAAT
sequence, and the like.

If desired, the non-coding region 3' to the sequence
encoding a Yia operon polypeptide of the invention may be
obtained by the above-described methods. This region may be
retained for its transcriptional termination regulatory
sequences, such as termination and polyadenylation. Thus, by
retaining the 3'-region naturally contiguous to the DNA sequence
encoding a polypeptide of the invention, the transcriptional
termination signals may be provided. Where the transcriptional
termination signals are not satisfactorily functional in the
expression host cell, then a 3' region functional in the host
cell may be substituted.

Two DNA sequences (such as a promoter region sequence and a
sequence encoding a polypeptide of the invention) are said to be
operably linked if the nature of the linkage between the two DNA
sequences does not (1) result in the introduction of a frame-
shift mutation, (2) interfere with the ability of the promoter
region sequence to direct the transcription of a gene sequence
encoding a polypeptide of the invention, or (3) interfere with
the ability of the gene sequence of a polypeptide of the
invention to be transcribed by the promoter region sequence.
Thus, a promoter region would be operably linked to a DNA
sequence if the promoter were capable of effecting transcription
of that DNA sequence. Thus, to express a gene encoding a
polypeptide of the invention, transcriptional and translational
signals recognized by an appropriate host are necessary.
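
Of the three conditions above, only condition (1) lends itself to a
simple sequence check. The sketch below is an illustrative assumption
rather than a method taken from the specification: it tests whether a
linker placed after an upstream coding segment keeps a downstream
coding segment in the same reading frame and introduces no premature
stop codon. Conditions (2) and (3) concern promoter function and would
have to be assessed experimentally. All names and sequences are
hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def junction_in_frame(upstream_coding: str, linker: str) -> bool:
    """Check that a segment joined after the linker stays in frame.

    upstream_coding is assumed to begin at the first base of the start codon.
    """
    joined = (upstream_coding + linker).upper()
    if len(joined) % 3 != 0:
        return False  # downstream codons would be read in a shifted frame
    codons = (joined[i:i + 3] for i in range(0, len(joined), 3))
    # A stop codon before the junction would truncate the fusion product.
    return not any(codon in STOP_CODONS for codon in codons)


if __name__ == "__main__":
    print(junction_in_frame("ATGAAA", "GGATCC"))  # six-base linker keeps the frame
    print(junction_in_frame("ATGAAA", "GGATC"))   # five-base linker shifts the frame
```

A linker whose length is not a multiple of three, or one that
introduces a stop codon, fails the check.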

The present invention encompasses the expression of a gene
encoding a polypeptide of the invention (or a functional
derivative thereof) in either prokaryotic or eukaryotic cells.
Prokaryotic hosts are, generally, very efficient and convenient
for the production of recombinant proteins and are, therefore,
one type of preferred expression system for polypeptides of the
invention. Prokaryotes most frequently are represented by
various strains of E. coli. However, other microbial strains may
also be used, including other bacterial strains.

In prokaryotic systems, plasmid vectors that contain
replication sites and control sequences derived from a species
compatible with the host may be used. Examples of suitable
plasmid vectors may include pBR322, pUC18, pUC19 and the like;
suitable phage or bacteriophage vectors may include λgt10, λgt11
and the like; and suitable virus vectors may include pMAM-neo,
pKRC and the like. Preferably, the selected vector of the
present invention has the capacity to replicate in the selected
host cell.



Recognized prokaryotic hosts include bacteria such as E.
coli, Bacillus, Streptomyces, Pseudomonas, Salmonella, Serratia,
Klebsiella, and the like. The prokaryotic host must be
compatible with the replicon and control sequences in the
expression plasmid.

To express a polypeptide of the invention (or a functional
derivative thereof) in a prokaryotic cell, it is necessary to
operably link the sequence encoding the polypeptide of the
invention to a functional prokaryotic promoter. Such promoters
may be either constitutive or, more preferably, regulatable
(i.e., inducible or derepressible). Examples of constitutive
promoters include the int promoter of bacteriophage λ, the bla
promoter of the β-lactamase gene sequence of pBR322, and the cat
promoter of the chloramphenicol acetyl transferase gene sequence
of pPR325, and the like. Examples of inducible prokaryotic
promoters include the major right and left promoters of
bacteriophage λ (PL and PR), the trp, recA, lacZ, lacI, and gal
promoters of E. coli, the α-amylase (Ulmanen et al., J.
Bacteriol. 162:176-182, 1985) and the σ-28-specific promoters of
B. subtilis (Gilman et al., Gene 32:11-20, 1984), the
promoters of the bacteriophages of Bacillus (Gryczan, In: The
Molecular Biology of the Bacilli, Academic Press, Inc., NY,
1982), and Streptomyces promoters (Ward et al., Mol. Gen. Genet.
203:458-478, 1986). Prokaryotic promoters are reviewed by Glick
(Ind. Microbiol. 1:277-282, 1987), Cenatiempo (Biochimie 68:505-
516, 1986), and Gottesman (Ann. Rev. Genet. 18:415-442, 1984).

Proper expression in a prokaryotic cell also requires the
presence of a ribosome-binding site upstream of the gene
sequence-encoding sequence. Such ribosome-binding sites are
disclosed, for example, by Gold et al. (Ann. Rev. Microbiol.
35:365-404, 1981). The selection of control sequences,
expression vectors, transformation methods, and the like, are
dependent on the type of host cell used to express the gene. As
used herein, "cell", "cell line", and "cell culture" may be used
interchangeably and all such designations include progeny. Thus,
the terms "transformants" or "transformed cells" include the
primary subject cell and cultures derived therefrom, without
regard to the number of transfers. It is also understood that
all progeny may not be precisely identical in DNA content, due to
deliberate or inadvertent mutations. However, as long as mutant
progeny have the same functionality as that of the originally
transformed cell, they are considered to be the same cell or
cell line.

Host cells which may be used in the expression systems of 
the present invention are not strictly limited, provided that 
they are suitable for use in the expression of the polypeptide of 
interest. Transcriptional initiation regulatory signals may be 
selected which allow for repression or activation, so that 
expression of the gene sequences can be modulated. Of interest 
are regulatory signals which are temperature-sensitive so that by 
varying the temperature, expression can be repressed or 
initiated, or are subject to chemical (such as metabolite) 
regulation. 

A nucleic acid molecule encoding a polypeptide of the
invention and an operably linked promoter may be introduced into
a recipient prokaryotic or eukaryotic cell either as a
nonreplicating DNA or RNA molecule, which may either be a linear
molecule or a closed covalent circular molecule. Alternatively,
permanent expression may occur through the integration of the
introduced DNA sequence into the host chromosome or as a circular
plasmid.

A vector may be employed which is capable of integrating the
desired gene sequences into the host cell chromosome. Cells
which have stably integrated the introduced DNA into their
chromosomes can be selected by also introducing one or more
markers which allow for selection of host cells which contain the
expression vector. The marker may provide for prototrophy to an
auxotrophic host, biocide resistance, e.g., antibiotics, or heavy
metals, such as copper, or the like. The selectable marker gene
sequence can either be directly linked to the DNA gene sequences
to be expressed, or introduced into the same cell by co-
transfection. Additional elements may also be needed for optimal
synthesis of mRNA. These elements may include splice signals, as
well as transcription promoters, enhancers, and termination
signals. cDNA expression vectors incorporating such elements
include those described by Okayama (Mol. Cell. Biol. 3:280-289,
1983).

The introduced nucleic acid molecule can be incorporated 
into a plasmid or viral vector capable of autonomous replication 
in the recipient host. Any of a wide variety of vectors may be 
employed for this purpose. Factors of importance in selecting a 
particular plasmid or viral vector include: the ease with which 
recipient cells that contain the vector may be recognized and 
selected from those recipient cells which do not contain the 
vector; the number of copies of the vector which are desired in 
a particular host; and whether it is desirable to be able to 
"shuttle" the vector between host cells of different species. 

Preferred prokaryotic vectors include plasmids such as those 
capable of replication in E. coli (such as, for example, pBR322, 
ColE1, pSC101, pACYC 184, πVX; "Molecular Cloning: A Laboratory
Manual", 1989, supra). Bacillus plasmids include pC194, pC221,
pT127, and the like (Gryczan, In: The Molecular Biology of the
Bacilli, Academic Press, NY, pp. 307-329, 1982). Suitable
Streptomyces plasmids include pIJ101 (Kendall et al., J.
Bacteriol. 169:4177-4183, 1987), and Streptomyces bacteriophages
such as φC31 (Chater et al., In: Sixth International Symposium on
Actinomycetales Biology, Akademiai Kaido, Budapest, Hungary, pp.
45-54, 1986). Pseudomonas plasmids are reviewed by John et al.
(Rev. Infect. Dis. 8:693-704, 1986), and Izaki (Jpn. J.
Bacteriol. 33:729-742, 1978).

Once the vector or nucleic acid molecule containing the 
construct(s) has been prepared for expression, the DNA
construct(s) may be introduced into an appropriate host cell by
any of a variety of suitable means, e.g., transformation,
transfection, conjugation, protoplast fusion, electroporation,
particle gun technology, calcium phosphate-precipitation, direct
microinjection, and the like. After the introduction of the
vector, recipient cells are grown in a selective medium, which
selects for the growth of vector-containing cells. Expression of
the cloned gene(s) results in the production of a polypeptide of
the invention, or fragments thereof. This can take place in the
transformed cells as such, or following the induction of these
cells to differentiate (for example, by administration of
bromodeoxyuracil to neuroblastoma cells or the like). A variety
of incubation conditions can be used to form the peptide of the 
present invention. The most preferred conditions are those which 
mimic physiological conditions. 

V. Antibodies, Hybridomas, Methods of Use and Kits for
Detection of Yia Operon-Related Polypeptides

The present invention relates to an antibody having binding 
affinity to a polypeptide of the invention. The polypeptide may 
have the amino acid sequence set forth in SEQ ID NO: 10, SEQ ID
NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15,
SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, or a functional
derivative thereof, or at least 6 contiguous amino acids thereof
(preferably, at least 15, 20, 25, 30, 35, or 40 contiguous amino
acids thereof).

The present invention also relates to an antibody having
specific binding affinity to a polypeptide of the invention. 
Such an antibody may be isolated by comparing its binding 
affinity to a polypeptide of the invention with its binding 
affinity to other polypeptides. Those which bind selectively to
a polypeptide of the invention would be chosen for use in methods
requiring a distinction between a polypeptide of the invention
and other polypeptides. Such methods could include, but should
not be limited to, the identification of other cells expressing
the polypeptides of the invention.

The polypeptides of the present invention can be used in a 
variety of procedures and methods, such as for the generation of 
antibodies, for use in identifying pharmaceutical compositions, 
and for selection of other enzymatic pathways.

The polypeptides of the present invention can be used to
produce antibodies or hybridomas. One skilled in the art will
recognize that if an antibody is desired, such a peptide could be
generated as described herein and used as an immunogen. The
antibodies of the present invention include monoclonal and
polyclonal antibodies, as well as fragments of these antibodies.

The present invention also relates to a hybridoma which 
produces the above-described monoclonal antibody, or binding
fragment thereof. A hybridoma is an immortalized cell line which
is capable of secreting a specific monoclonal antibody.

In general, techniques for preparing monoclonal antibodies
and hybridomas are well known in the art (Campbell, "Monoclonal
Antibody Technology: Laboratory Techniques in Biochemistry and
Molecular Biology," Elsevier Science Publishers, Amsterdam, The
Netherlands, 1984; St. Groth et al., J. Immunol. Methods 35:1-21,
1980). Any animal (mouse, rabbit, and the like) which is known
to produce antibodies can be immunized with the selected
polypeptide. Methods for immunization are well known in the art.
Such methods include subcutaneous or intraperitoneal injection of
the polypeptide. One skilled in the art will recognize that the
amount of polypeptide used for immunization will vary based on
the animal which is immunized, the antigenicity of the
polypeptide and the site of injection.

The polypeptide may be modified or administered in an 
adjuvant in order to increase the peptide antigenicity. Methods 
of increasing the antigenicity of a polypeptide are well known in
the art. Such procedures include coupling the antigen with a
heterologous protein (such as globulin or β-galactosidase) or
through the inclusion of an adjuvant during immunization. 

For monoclonal antibodies, spleen cells from the immunized 
animals are removed, fused with myeloma cells, such as SP2/0-Ag14
myeloma cells, and allowed to become monoclonal antibody
producing hybridoma cells. Any one of a number of methods well
known in the art can be used to identify the hybridoma cell which
produces an antibody with the desired characteristics. These
include screening the hybridomas with an ELISA assay, western
blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res.
175:109-124, 1988). Hybridomas secreting the desired antibodies
are cloned and the class and subclass are determined using
procedures known in the art (Campbell, "Monoclonal Antibody
Technology: Laboratory Techniques in Biochemistry and Molecular
Biology", supra, 1984). 

For polyclonal antibodies, antibody-containing antiserum is
isolated from the immunized animal and is screened for the
presence of antibodies with the desired specificity using one of
the above-described procedures. The above-described antibodies
may be detectably labeled. Antibodies can be detectably labeled
through the use of radioisotopes, affinity labels (such as
biotin, avidin, and the like), enzymatic labels (such as
horseradish peroxidase, alkaline phosphatase, and the like),
fluorescent labels (such as FITC or rhodamine, and the like),
paramagnetic atoms, and the like. Procedures for accomplishing
such labeling are well-known in the art, for example, see
Sternberger et al., J. Histochem. Cytochem. 18:315, 1970; Bayer et
al., Meth. Enzym. 62:308-, 1979; Engval et al., Immunol.
109:129-, 1972; Goding, J. Immunol. Meth. 13:215-, 1976. The
labeled antibodies of the present invention can be used for in
vitro, in vivo, and in situ assays to identify cells or tissues 
which express a specific peptide. 

The above-described antibodies may also be immobilized on a
solid support. Examples of such solid supports include plastics
such as polycarbonate, complex carbohydrates such as agarose and
sepharose, acrylic resins such as polyacrylamide and latex beads.
Techniques for coupling antibodies to such solid supports are
well known in the art (Weir et al., "Handbook of Experimental
Immunology" 4th Ed., Blackwell Scientific Publications, Oxford,
England, Chapter 10, 1986; Jacoby et al., Meth. Enzym. 34,
Academic Press, N.Y., 1974). The immobilized antibodies of the
present invention can be used for in vitro, in vivo, and in situ
assays as well as immunochromatography.

Furthermore, one skilled in the art can readily adapt
currently available procedures, as well as the techniques,
methods and kits disclosed herein with regard to antibodies, to
generate peptides capable of binding to a specific peptide
sequence in order to generate rationally designed antipeptide
peptides (Hurby et al., "Application of Synthetic Peptides:
Antisense Peptides", In Synthetic Peptides, A User's Guide, W.H.
Freeman, NY, pp. 289-307, 1992; Kaspczak et al., Biochemistry
28:9230-9238, 1989). 

Anti-peptide peptides can be generated by replacing the
basic amino acid residues found in the peptide sequences of the
Yia operon polypeptides of the invention with acidic residues,
while maintaining hydrophobic and uncharged polar groups. For
example, lysine, arginine, and/or histidine residues are replaced
with aspartic acid or glutamic acid and glutamic acid residues
are replaced by lysine, arginine or histidine.
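
The substitution rule described above can be sketched in a few lines of Python. This is an illustration only, not a method taken from the specification: the function name, the one-letter sequence representation, and the default choices (aspartic acid for basic residues, lysine for glutamic acid) are assumptions made for the example. Aspartic acid is left unchanged because the text names a replacement only for glutamic acid.

```python
# Illustrative sketch only; names and defaults are hypothetical.
BASIC_RESIDUES = frozenset("KRH")   # lysine, arginine, histidine
ACIDIC_RESIDUES = frozenset("DE")   # aspartic acid, glutamic acid


def antipeptide(sequence: str, acidic: str = "D", basic: str = "K") -> str:
    """Swap charged residues in a one-letter amino acid sequence.

    Basic residues (K, R, H) become `acidic`; glutamic acid (E) becomes
    `basic`. Every other residue, including aspartic acid, is kept as is.
    """
    if acidic not in ACIDIC_RESIDUES:
        raise ValueError("acidic must be 'D' or 'E'")
    if basic not in BASIC_RESIDUES:
        raise ValueError("basic must be 'K', 'R', or 'H'")

    swapped = []
    # Single pass over the input, so a newly written residue is never re-swapped.
    for residue in sequence.upper():
        if residue in BASIC_RESIDUES:
            swapped.append(acidic)
        elif residue == "E":
            swapped.append(basic)
        else:
            swapped.append(residue)
    return "".join(swapped)


print(antipeptide("AKREHDG"))  # prints ADDKDDG
```

Hydrophobic and uncharged polar residues pass through untouched, which matches the stated aim of maintaining those groups.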

The present invention also encompasses a method of detecting 
a Yia operon-related polypeptide in a sample, comprising: (a) 
contacting the sample with an above-described antibody, under
conditions such that immunocomplexes form, and (b) detecting the
presence of said antibody bound to the polypeptide. In detail,
the methods comprise incubating a test sample with one or more of
the antibodies of the present invention and assaying whether the 
antibody binds to the test sample. Detection of a polypeptide of 
the invention in a sample may indicate the presence of the 
pathway of the invention in other cells. 

Conditions for incubating an antibody with a test sample 
vary. Incubation conditions depend on the format employed in the 
assay, the detection methods employed, and the type and nature of 
the antibody used in the assay. One skilled in the art will 
recognize that any one of the commonly available immunological 
assay formats (such as radioimmunoassays, enzyme-linked 
immunosorbent assays, diffusion-based Ouchterlony, or rocket
immunofluorescent assays) can readily be adapted to employ the
antibodies of the present invention. Examples of such assays can
be found in Chard ("An Introduction to Radioimmunoassay and
Related Techniques" Elsevier Science Publishers, Amsterdam, The
Netherlands, 1986), Bullock et al. ("Techniques in
Immunocytochemistry," Academic Press, Orlando, FL Vol. 1, 1982;
Vol. 2, 1983; Vol. 3, 1985), Tijssen ("Practice and Theory of
Enzyme Immunoassays: Laboratory Techniques in Biochemistry and
Molecular Biology," Elsevier Science Publishers, Amsterdam, The
Netherlands, 1985).

The immunological assay test samples of the present 
invention include cells, protein or membrane extracts of cells, 
or environmental samples. The test samples used in the above- 
described method will vary based on the assay format, nature of 
the detection method and the tissues, cells or extracts used as 
the sample to be assayed. Methods for preparing protein extracts 
or membrane extracts of cells are well known in the art and can 
readily be adapted in order to obtain a sample which is testable 
with the system utilized. 

A kit contains all the necessary reagents to carry out the 
previously described methods of detection. The kit may comprise: 
(i) a first container means containing an above-described 
antibody, and (ii) a second container means containing a conjugate
comprising a binding partner of the antibody and a label.
Preferably, the kit also contains instructions for use. In
another preferred embodiment, the kit further comprises one or
more other containers comprising one or more of the following:
wash reagents and reagents capable of detecting the presence of
bound antibodies. 

Examples of detection reagents include, but are not limited 
to, labeled secondary antibodies, or in the alternative, if the 
primary antibody is labeled, the chromophoric, enzymatic, or
antibody binding reagents which are capable of reacting with the
labeled antibody. The compartmentalized kit may be as described
above for nucleic acid probe kits. One skilled in the art will
readily recognize that the antibodies described in the present
invention can readily be incorporated into one of the established
kit formats which are well known in the art.

Other methods associated with the invention are described in 
the examples disclosed herein. 

Examples 

The examples below are not limiting and are merely 
representative of various aspects and features of the present 
invention. The examples below demonstrate the construction and 
use of metabolic selection systems, and the isolation of desired 
enzymatic pathways.

EXAMPLE 1: Construction of a Tester Strain for the Selection of 
Pathways from 2-KLG to AsA 

This example is exemplary of how to construct tester 
strains, and therefore can be applied to the identification and 
construction of tester strains for the selection of other 
metabolic pathways. The basic idea is to take environmental 
samples and test them for growth on a target compound (in the 
example, ascorbate). Then, positive colonies are screened for
the inability to grow on the source compound (in the example,
2-KLG). The tester strain is the one that grows on the target, but
not the source compound. Once the genes encoding the metabolic
pathway from the target compound to the essential factor (an
element such as carbon, nitrogen, sulphur or phosphorus, or a
nutrient, for example) are identified, they are then placed under
the control of an inducible promoter, and the tester strain is
ready to be utilized to select for the metabolic pathway from the 
source to the target compound. 

If it proves difficult to obtain a tester strain that grows 
on the target, but not the source, but strains exist that do not 
grow on the source, then the pathway that permits growth on the
target can be isolated and transferred to another strain that 
doesn't grow on the source in order to obtain the desired tester 
strain. 
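
The selection logic in the two paragraphs above can be written as a simple filter over growth phenotypes. The sketch below is an assumption-laden illustration: the `Isolate` record, the function names, and the three example isolates are hypothetical and do not correspond to data in this specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Isolate:
    """Growth phenotype of one environmental isolate (hypothetical record)."""
    name: str
    grows_on_target: bool  # e.g. growth on AsA minimal medium
    grows_on_source: bool  # e.g. growth on 2-KLG minimal medium


def candidate_tester_strains(isolates):
    """Isolates that grow on the target but not on the source compound."""
    return [i for i in isolates if i.grows_on_target and not i.grows_on_source]


def candidate_recipient_strains(isolates):
    """Fallback: isolates that do not grow on the source.

    A pathway permitting growth on the target could be transferred into
    one of these if no isolate passes the primary filter.
    """
    return [i for i in isolates if not i.grows_on_source]


# Made-up phenotypes for illustration only.
isolates = [
    Isolate("isolate_A", grows_on_target=True, grows_on_source=True),
    Isolate("isolate_B", grows_on_target=True, grows_on_source=False),
    Isolate("isolate_C", grows_on_target=False, grows_on_source=False),
]

testers = candidate_tester_strains(isolates)
recipients = candidate_recipient_strains(isolates)
print([i.name for i in testers])
print([i.name for i in recipients])
```

The fallback list is consulted only when the primary filter returns nothing, mirroring the order in which the text presents the two options.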

Isolation of a Strain that Grows on AsA, but not 2-KLG

Samples from diverse natural environments were collected to
use for the isolation of microbes that can utilize ascorbic acid
(AsA) as the sole carbon source. No bacterial species has
previously been reported to grow on AsA minimal medium.

Environmental samples were collected from freshwater lakes,
lemon and orange orchards, residential backyard soils, and human and
animal solid wastes.

Over 100 microbial isolates, capable of forming visible 
colonies within 20 hours of incubation at 30 °C on M9 minimal 
medium containing 0.5% AsA, were selected from these samples. 

These 100 isolates were then screened for their ability to grow
on 2-Keto-L-Gulonate (2-KLG) minimal medium. 

One of the isolates that could utilize AsA as its sole 
source of carbon and energy, but could not grow on 2-KLG, was 
identified as Klebsiella oxytoca (Table 1). Thus, Klebsiella
oxytoca was retained as a candidate for genetic engineering of a
host strain that can use AsA under controlled conditions for the
selection of cloned microbial pathways from 2-KLG to AsA.

Other bacterial strains capable of using ascorbic
acid as a source of carbon and energy were also identified, as were some that
also used 2-KLG as a source of carbon and energy (Table 1).



Table 1

COMPOUND UTILIZATION OF ENVIRONMENTAL ISOLATES

                                  AsA       2-KLG

GRAM POSITIVES                    72 HR     24 HR
Bacillus megaterium               +         +
Streptomyces species              ++        ++
Yellow Bug                        ++        +++

GRAM NEGATIVES                    24 HR     72 HR
Klebsiella pneumoniae             +++
Klebsiella species                +++
Klebsiella oxytoca                +++
Unknown Malodorous Short Rod      ++



Identification of Genes Responsible for AsA Catabolism



In order to identify the gene(s) responsible for AsA 
catabolism in K. oxytoca, mutagenesis by transposition insertion 
was performed in K. oxytoca strain VJSK009 (Call, B. M., et al.,
1989. J. Bacteriol. 171:2666-2672) using the pfd-Tn5 delivery
vector as described by Metzger, M., et al., 1992. Nucl. Acids
Res. 20:2265-2270. Among 5,000 clones screened, several mutants
that were no longer capable of growing on AsA were identified,
most of which were also affected in their ability to grow on
conventional carbon sources such as glucose, maltose, pyruvate or
succinate. Two of the mutants, however, were specifically
affected in AsA utilization and were further characterized by
cloning and sequencing the regions adjacent to the transposon
insertion.

Characterization of the Genes/Proteins of the Operon

In both mutants, the Tn5 insertion was found to disrupt the 
same operon of 8 genes. This operon was found to be homologous 
to the yiaK-S operon of E. coli (Blattner, F. R., et al., 1997. 
Science 277:1453-1462) which is thought to be involved with 
carbohydrate utilization (Badia, J., et al., 1998. J. Biol. Chem. 
273:8376-8381).

Similarly to E. coli, the K. oxytoca yiaK-S operon is 
preceded by a transcriptional regulator, yiaJ. A physical map of 
the yiaK-S operon and its putative regulator is shown in Figure 
1. The nucleic acid sequence and translated amino acid sequence 
of the open reading frames of the operon and its putative
regulator are shown in Figure 2 A-F. 

The functions of the yia operon gene products in K. oxytoca 
and E. coli are unknown, except for the E. coli lyxK-encoded
enzyme which was shown to phosphorylate L-xylulose and play a key
role in the utilization of L-lyxose by E. coli (Sanchez, J. C.,
et al., 1994. J. Biol. Chem. 269:29665-29669). However, the
yiaK-S operon is thought to be silent in wild-type E. coli:
L-xylulose kinase activity could not be detected in wild-type cells, and
E. coli K12 is unable to metabolize L-lyxose (Sanchez, J. C., et
al., 1994. supra). A similar operon is also present in
Haemophilus influenzae, but no function has been determined for
any of the open reading frames (Fleischmann, R. D., et al., 1995.
Science 269:496-512).

Alignments of the yia open reading frames common among the 
three species are shown (Figs. 3-9). Based on sequence 
similarities, yiaQ has been classified as a putative hexulose-6- 
phosphate synthase, yiaR as a putative hexulose-6-phosphate 
isomerase, and yiaS as a putative sugar isomerase (data not
shown).

Placing the Operon under the Control of an Inducible Promoter

To engineer K. oxytoca as a host strain for the selection of 
biocatalysts which produce AsA, the promoter of the yiaK-S operon
was replaced with a DNA fragment that contained the trp-lac
hybrid promoter of transcription, the lacO operator, and the
lacIq repressor gene (Brosius, J. 1992. Meth. Enzymol. 216:469-
483). This allows the yiaK-S operon, and therefore AsA
catabolism, to be turned ON and OFF in a tightly controlled
manner in the presence or absence of IPTG, a non-metabolizable
inducer of the lac promoter. Practically, a 5-way ligation was
set up among: (i) the pMAK705 integration vector which carries a
chloramphenicol resistance marker and the thermosensitive origin
of replication from plasmid pHOl (Hamilton, C. M., et al., 1989.
J. Bacteriol. 171:4617-4622); (ii) a 0.8 kb fragment containing
the 5' region of the yiaJ gene and its promoter sequences; (iii)
the spectinomycin resistance marker retrieved from Staphylococcus
aureus Tn554 (Murphy, E. 1985. Mol. Gen. Genet. 200:33-39) to
follow integration events; (iv) the lacIq-lacO-trc promoter
fragment retrieved from pSE380 (InVitrogen, Carlsbad, CA); and
(v) a 1 kb fragment containing the 5' end of yiaK, including its
ribosome binding site for translation initiation while excluding
the promoter sequences of the yiaK-S operon (Figure 10).

The recombinant plasmid, pMG125, was introduced into K. 
oxytoca wild type strain VJSK009 by transformation at 30 °C, the 
permissive temperature for pMAK705 replication. Chromosomal 
integration of the pMG125 insert by double crossover at the yiaJ-K
locus was achieved by successive temperature switches as
described by Hamilton, C. M., et al. (1989, supra). PCR
analyses were performed on 12 candidates to verify that the
endogenous promoter of the yiaK-S operon had been replaced with
the inducible lacIq-trc promoter system (Figure 10).

The resulting strain, MGK003, proved able to grow on M9
minimal medium supplemented with 0.25% AsA and 10 to 100 µM IPTG,
while no growth was observed on the same medium lacking IPTG.

EXAMPLE 2: Preparation of Environmental DNA Libraries 

An example of a currently preferred method for the isolation
of DNA from environmental samples is provided below. In the
example, purification from soil and water samples is described;
however, samples can be from any environmental source and the
methods adapted according to practices well-known in the art.

Direct Isolation of Total DNA from Soil and Water Samples

Total microbial DNA was isolated from various soil and water
samples according to the following procedure, which is derived and
modified from Steffan, R.J., et al., 1988. Appl. Environ.
Microbiol. 54:2908-2915; Whatling, C. A., and C. M. Thomas. 1993.
Anal. Biochem. 210:98-101; and Zhou, J., et al., 1996. Appl.
Environ. Microbiol. 62:316-322.

1. Begin with 100 g wet soil or 50 g dry soil;
150 mL sodium phosphate buffer 0.1 M, pH 4.5; and 5 g
PVPP (acid washed).

2. Blender - medium speed - 3 times for 1 min (cool down
between each cycle). Add 0.5 mL SDS 20%, blend 5 more
seconds.

3. Centrifuge 10 min at 1,000 g at 10 °C.

4. Keep supernatant. Repeat extraction twice with soil
pellet.

5. Combine the 3 supernatants. Centrifuge 20 min at
10,000 g at 10 °C.

6. Wash pellet with cold 0.1% sodium-0.1% sodium
pyrophosphate. Homogenize with blender for 1 min or
shake. Centrifuge 20 min at 10,000 g at 10 °C.

7. Wash pellet with 33 mM Tris-HCl, 1 mM EDTA, pH 8.0.

8. Resuspend in 2 mL 10 mM Tris, pH 7.6; 1 N NaCl.

9. Mix with equal volume 1.2% LMP agarose at 42 °C. Pour
into 1 mL syringes. Polymerize for 20 min at 4 °C.

10. Incubate 3-4 hours at 37 °C in 20 vol. 1 N NaCl; 100 mM
EDTA; 10 mM Tris, pH 7.5; 1% sarkosyl; 1 mg/mL
lysozyme.

11. Add 1 mg/mL proteinase K. Incubate overnight at 45 °C.

12. Wash agarose plugs twice with TE. Store in 100 mM
EDTA; 10 mM Tris at 4 °C.

13. Load noodles on LMP agarose gel 0.7%. Cut out chromosomal
band. Heat 15 min at 65 °C in TE buffer. Add 2
U GelZyme (InVitrogen) per 200 µL 1% agarose. Incubate
for 2 h at 40 °C. EtOH precipitate for no more than 30
min at -20 °C.

Preparation of Total DNA from Post-Enrichment Cultures 

Aliquots from 18 water or soil samples were used to 
inoculate 50 mL of M9 minimal medium supplemented with any one of 
the following carbon sources: 0.5% 2-KLG; 0.25% L-idonate (L-IA);
0.25% L-gulonate (L-GuA) and 0.25% ascorbate. Culture flasks
were incubated for 2 to 3 days at 30 °C without agitation.
Total DNA was isolated from these cultures as follows: 

1. 20 mL were centrifuged for 5 min at 6,000 rpm.

2. Pellets were washed with 5 mL Tris 10 mM, EDTA 1 mM pH
8.0 (TE), were centrifuged again, and were resuspended in 0.9 mL
TE.

3. Lysozyme (5 mg/mL) and RNase (100 µg/mL) were added,
and cells were incubated for 10 min at 37 °C.

4. Sodium dodecylsulfate (SDS) was added to a final
concentration of 1%, and the tubes were gently shaken until lysis
was completed.

5. 200 mL of a 5 N NaClO4 stock solution were added to the
lysate.

6. The mixture was extracted once with one volume of
phenol:chloroform (1:1) and once with one volume of chloroform.

7. Chromosomal DNA was precipitated by adding 2 mL of cold
(-20 °C) ethanol and gently coiling the precipitate around a
curved Pasteur pipette.

8. DNA was dried for 30 min at room temperature and was
resuspended in 100 to 500 µL of Tris 10 mM, EDTA 1 mM, NaCl 50 mM
pH 8.0 to obtain a DNA concentration of 0.5 to 1 µg/µL.

EXAMPLE 3: Selection for Nucleic Acid which Converts 2-KLG to
AsA (Fig. 12)

This example is exemplary of how to select for nucleic acid 
sequences that encode metabolic pathways, and therefore can be 
applied to the identification and selection of sequences encoding 
other metabolic pathways. Basically, a nucleic acid library is 
made, according to methods well-known in the art, from nucleic 
acid sequences isolated from environmental samples (as described 
in Example 2, for example). This library is then transfected 
into the tester strain and the resulting pool of transfected 
cells selected for growth on the source compound (2-KLG in the 
example) in the absence of the target compound (ascorbate in the 
example) and the presence of the inducer. 

Construction of an Enrichment DNA Library in a Cosmid Vector

The SuperCos1 cosmid vector (Stratagene, La Jolla, CA) is a
λ-based cloning system suitable for the cloning of large DNA
fragments. After treatment according to the manufacturer's
instructions, the 8 kb-long vector appears as two arms flanked by
cos sites which are recognized by the λ-packaging machinery.
Since only DNA molecules from 40 to 48 kb are efficiently
packaged in λ-heads, this allows the selective cloning of 32 to
40 kb inserts between the two arms.
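
The 32 to 40 kb insert window follows from subtracting the vector length from the packageable size range. A minimal worked sketch of that arithmetic, using the figures quoted above (8 kb vector, 40 to 48 kb packaged DNA); the function names are hypothetical and chosen only for this illustration:

```python
def insert_size_window(vector_kb, packaged_min_kb, packaged_max_kb):
    """Return (min, max) insert size in kb for a given vector length."""
    if vector_kb >= packaged_min_kb:
        raise ValueError("vector alone already reaches the packaging minimum")
    return packaged_min_kb - vector_kb, packaged_max_kb - vector_kb


def is_packageable(insert_kb, vector_kb=8.0,
                   packaged_min_kb=40.0, packaged_max_kb=48.0):
    """True if vector plus insert falls inside the packaging range."""
    low, high = insert_size_window(vector_kb, packaged_min_kb, packaged_max_kb)
    return low <= insert_kb <= high


window = insert_size_window(8.0, 40.0, 48.0)
print(window)  # prints (32.0, 40.0)
```

Under these figures a 5 kb fragment ligated between the arms would give a 13 kb molecule, which falls outside the packaging range, so packaging itself acts as a size selection on the ligation products.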

Chromosomal DNA extracted from 20 post-enrichment cultures 
was mixed in equal amounts. Five to ten µg of the mixture were
partially digested with Sau3A restriction enzyme to obtain DNA
fragments sized between 5 and 50 kb, were dephosphorylated, and
were ligated with SuperCos1 arms using conditions recommended by
the supplier. One µg of the ligation mixture was used in an in
vitro packaging reaction using the Gigapack III Gold packaging 
kit from Stratagene to create the cosmid library. 

Clearly, this procedure can be used to make other 
chromosomal DNA libraries, for example from other enriched 
environmental samples, or from chromosomal DNA extracted directly 
from environmental samples.

Transfection and Selection of the Cosmid Library

Prior to transfection of K. oxytoca strain MGK003 with the
packaging mixture, the tester strain was transformed with plasmid
pCB382 expressing the E. coli lamB gene that functions as the λ
receptor, which appears to be absent or non-functional in most
Klebsiella strains (De Vries, G. E., et al., 1984. Proc. Natl.
Acad. Sci. USA 81:6080-6084). The resulting MGK003 [pCB382] strain
was transfected with the packaged products as follows:

1. Five mL of liquid LB medium supplemented with 0.2%
maltose and 10 mM MgSO4 were inoculated from an overnight
preculture of strain MGK003 [pCB382].

2. Cells were grown to an OD600 of 0.5, were centrifuged at
500 x g for 10 min, and were resuspended in the same volume of 10
mM MgSO4.

3. The packaging products were mixed with 2 mL of cells in
15 mL culture tubes, and were incubated for 20 min at 39 °C
without shaking.

4. After adding 2.5 mL of 2x YT (1% NaCl; 1% yeast
extract; 1.6% tryptone), cells were incubated at 37 °C for 1 h
under gentle agitation.

5. A 100 µL aliquot was plated on LB-kanamycin medium to
determine the number of clones present in the cosmid library.

6. The remainder was centrifuged at 3000 g for 5 min and
was resuspended in 1 mL of M9 minimal medium supplemented with 10
µM IPTG (IPTG concentration can be varied up to 100 µM), and
aliquots (200 µL) were plated on M9 plates containing 0.5% 2-KLG
and 50 µM IPTG.

7. Plates were incubated at 37 °C for 36 h
for selecting candidate pathways that would convert 2-KLG to AsA.
(Alternatively, selection can be done at 30 °C.)
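
Step 5 estimates the library size from a plated aliquot. The sketch below shows one way that arithmetic could be done. It rests on assumptions not stated in the procedure: the culture volume at plating is taken as the 2 mL of cells plus the 2.5 mL of 2x YT (the volume of the packaging products is not given and is ignored), and the colony count and dilution factor are made-up values for illustration.

```python
def estimate_total_clones(colonies, plated_ul, culture_ul, dilution_factor=1.0):
    """Scale a plate count up to the whole transfection culture.

    colonies         colonies counted on the LB-kanamycin plate
    plated_ul        volume plated, in microliters
    culture_ul       total culture volume at the time of plating, in microliters
    dilution_factor  fold dilution applied before plating (1.0 = undiluted)
    """
    if plated_ul <= 0 or culture_ul <= 0 or dilution_factor <= 0:
        raise ValueError("volumes and dilution factor must be positive")
    return colonies * dilution_factor * culture_ul / plated_ul


# Assumed culture volume: 2 mL cells + 2.5 mL 2x YT, packaging mix ignored.
CULTURE_UL = 2000 + 2500

# Hypothetical count: 250 colonies from 100 microliters of a 10-fold dilution.
total = estimate_total_clones(250, plated_ul=100, culture_ul=CULTURE_UL,
                              dilution_factor=10)
print(total)  # prints 112500.0
```

Keeping the dilution factor as an explicit parameter makes it easy to recompute the estimate if the aliquot is diluted to give countable plates.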

Among 500,000 clones to which a first selection round was
applied, approximately 100 colonies of various sizes appeared on
2-KLG/IPTG plates. These were re-streaked on: (i) LB-kanamycin
to verify the presence of the cosmid vector; (ii) 2-KLG/IPTG; and
(iii) 2-KLG lacking IPTG to determine if growth of the positive
clones on 2-KLG was dependent upon the expression of AsA
catabolism. 

Two clones were retained that grew on LB-kanamycin and
2-KLG/IPTG, but not on 2-KLG without IPTG within 20 h at 37 °C. To
verify that the observed phenotype was conferred by the cloned
DNA, cosmid DNA was extracted from these two clones and
introduced, by electroporation, into strain MGK003. In both
cases, the back-cross gave a phenotype identical to that of the
original clone obtained in the selection process (data not
shown).
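
The retention criterion applied to the re-streaked colonies is a three-way growth test. It can be sketched as a small predicate; the record type, the function name, and the three example clones below are hypothetical and are not results from this example.

```python
from typing import NamedTuple


class RestreakResult(NamedTuple):
    """Growth of one clone on the three re-streak media (hypothetical record)."""
    lb_kanamycin: bool      # cosmid vector present
    klg_with_iptg: bool     # growth on 2-KLG with AsA catabolism induced
    klg_without_iptg: bool  # growth on 2-KLG with AsA catabolism off


def is_retained(result: RestreakResult) -> bool:
    """Keep clones whose growth on 2-KLG depends on induced AsA catabolism."""
    return (result.lb_kanamycin
            and result.klg_with_iptg
            and not result.klg_without_iptg)


# Made-up phenotypes for illustration only.
results = {
    "clone_1": RestreakResult(True, True, False),   # IPTG-dependent growth
    "clone_2": RestreakResult(True, True, True),    # grows without induction
    "clone_3": RestreakResult(False, True, False),  # no kanamycin resistance
}

retained = sorted(name for name, r in results.items() if is_retained(r))
print(retained)
```

A clone that grows on 2-KLG without IPTG is rejected because its growth does not depend on the induced AsA pathway and so gives no evidence of conversion of 2-KLG to AsA.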

Selection of libraries can also be done on other carbon 
sources to isolate other pathways, for example on L-gulonate 
(0.25%) plus IPTG to isolate pathways from L-gulonate to AsA, or 
on L-idonate (0.25%) plus IPTG to isolate pathways from L-idonate 
to AsA. 

EXAMPLE 4: Isolation of Other Pathways 

The metabolic selection strategy described above can also be 
used for the isolation of other pathways of interest, for example 
from 2-KLG to L-idonate, or 2-KLG to L-gulonate, or 
alternatively, to identify new reductase enzymes capable of the 
conversion of 2,5-DKG to 2-KLG. This conversion is one of the 
slow steps in the production of ascorbate, so identification of 
an enzymatic method would be economically useful. Basically, the 
strategy described in the examples above can be used to isolate 
any pathway to metabolize a compound as a carbon, nitrogen, 
sulfur, or potentially, a phosphorus source.

EXAMPLE 5: Directed Evolution of Enzymes 

This metabolic selection method is also capable of 
facilitating the directed evolution of enzymes. One can use this 
technique to screen known enzymes for mutations leading to higher 
efficiency, or to better specify optimal temperature or cofactor 
requirements, in the metabolic utilization of a compound. The 
mutations can be the result of natural evolution, the result of
PCR or chemical mutagenesis, or created through techniques like
DNA shuffling.

EXAMPLE 6: Glucose to Ascorbic Acid Directly 

Another permutation on this strategy that can be envisioned
is to find new pathways for already existing processes, e.g.
selection for a new pathway for the conversion of glucose to
ascorbic acid using only a few enzymatic steps. This is feasible
using, for example, a strain for which the sequence of the entire
genome is known, such as E. coli or B. subtilis. The genes for
the metabolism of glucose can be mutagenized such that the strain
can no longer use glucose as a carbon/energy source, and then 
glucose-utilization pathways can be selected for as described in 
the previous examples. 

EXAMPLE 7: Ascorbate Biosensor (Fig. 13) 

As mentioned above, the YiaJ protein is thought to be a
regulator for the Yia operon. The experiments of the invention
indicate that the regulatory activity of YiaJ may be, in part,
modulated by sensing ascorbate. Thus, it is currently believed
that the "sensing" of ascorbate by YiaJ (perhaps through binding,
although the authors do not wish to be restricted to this
interpretation) leads to the activation of the Yia operon, and
thus the use of ascorbate as a carbon/energy source. This
potentially results in an extremely sensitive "biosensor" for
ascorbate. Thus, for example, it is envisioned that yiaJ could
be placed in a construct such that when YiaJ bound ascorbate a
detectable signal resulted, i.e. instead of turning "ON" or "OFF"
the Yia operon, YiaJ could turn "ON" or "OFF" a gene which
produces a detectable signal, for example a gene for fluorescence
(e.g. β-galactosidase), luminescence (e.g. luciferase), or color
(lac operon, or green fluorescent protein). Methods of
constructing these signal constructs are well-known in the art
(e.g. Simpson, et al. 1998. TIBTECH 16: 332-338; Applegate, et
al. 1998. Applied Environ. Microbiol. 64: 2730-2735; Selifonova
and Eaton, 1996. Applied Environ. Microbiol. 62: 778-783).
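
Where such a biosensor is used to estimate the amount of ascorbate rather than only its presence, the reporter signal would first have to be calibrated against known standards. The sketch below shows one simple way this could be done, by linear interpolation between calibration points. It assumes that the signal increases monotonically with concentration over the calibrated range; the function name, units, and numbers are hypothetical and are not measurements from this specification.

```python
def estimate_concentration(standards, signal):
    """Estimate a concentration by linear interpolation between standards.

    `standards` is a list of (concentration, signal) pairs in which the
    signal rises with concentration. Returns None when `signal` falls
    outside the calibrated range, since extrapolation is not attempted.
    """
    points = sorted(standards)
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return c0
            return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)
    return None


# Hypothetical calibration: (concentration in arbitrary units, reporter signal).
calibration = [(0.0, 100.0), (1.0, 300.0), (10.0, 1200.0)]
mid_estimate = estimate_concentration(calibration, 750.0)
```

A real reporter response is unlikely to be linear over a wide range, so a fitted dose-response curve may be more appropriate than piecewise interpolation.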

These biosensor constructs can also be used in the methods
of the invention to screen for a metabolic pathway,
instead of using selection on an essential factor or element. In
this case, the tester strain would be one that does not have the
source to target pathway, as determined by the absence of target
being detected by the biosensor in the presence or the absence of
the source compound. Thus, the biosensor would need to "sense"
and to "react to" the presence of the target compound by any one
of the methods described above. Following transfection of the
library of nucleic acid from environmental sources, the resulting
cells would be screened for the presence of the target compound
using the biosensor. To handle the number of
colonies that would need to be screened, this step could be automated,
with colonies read in luminescent or fluorescent readers or sorted by FACS
prior to further testing and identification of individual
colonies. Although this requires more initial screening than
selection using an essential element, this method offers an
alternative approach when the appropriate tester strain or the
metabolic pathway is not available for screening using an
essential factor. Thus, the biosensor method provides the
flexibility to identify pathways for target compounds that need not be
metabolizable to an essential element, factor, or nutrient, but
can be any compound for which a "biosensor" can be identified.
Biosensors can be identified and created as described above.
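
The automated read-out described above can be illustrated with a minimal sketch of how candidate colonies might be flagged from plate-reader data. It assumes that the tester strain without library DNA serves as the negative control and that a colony is called a hit when its signal exceeds the control mean by k standard deviations. The threshold rule, the value of k, and all readings are assumptions made for illustration.

```python
from statistics import mean, stdev


def call_hits(control_signals, colony_signals, k=3.0):
    """Flag colonies whose reporter signal exceeds the negative controls.

    `control_signals` are readings from the tester strain without library
    DNA; `colony_signals` maps a colony identifier to its reading.
    Returns the threshold used and the list of colonies above it.
    """
    threshold = mean(control_signals) + k * stdev(control_signals)
    hits = [name for name, s in colony_signals.items() if s > threshold]
    return threshold, hits


# Hypothetical luminescence readings, for illustration only.
controls = [100, 110, 90, 105, 95]
colonies = {"c1": 120, "c2": 480, "c3": 101}
threshold, hits = call_hits(controls, colonies)
```

Colonies flagged in this way would still need to be retested individually, as the text notes, since a single reading above threshold does not by itself show that the target compound is being produced.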

One skilled in the art would readily appreciate that the
present invention is well adapted to carry out the objects and
obtain the ends and advantages mentioned, as well as those
inherent therein. The molecular complexes and the methods,
procedures, treatments, molecules, and specific compounds described
herein are presently representative of preferred embodiments, are
exemplary, and are not intended as limitations on the scope of the
invention. Changes therein and other uses will occur to those
skilled in the art which are encompassed within the spirit of the
invention as defined by the scope of the claims.

It will be readily apparent to one skilled in the art that
varying substitutions and modifications may be made to the
invention disclosed herein without departing from the scope and
spirit of the invention.

All patents and publications mentioned in the specification 
are indicative of the levels of those skilled in the art to which 
the invention pertains. 

The invention illustratively described herein suitably may
be practiced in the absence of any element or elements,
limitation or limitations which is not specifically disclosed
herein. Thus, for example, in each instance herein any of the
terms "comprising", "consisting essentially of" and "consisting
of" may be replaced with either of the other two terms. The
terms and expressions which have been employed are used as terms
of description and not of limitation, and there is no intention
that in the use of such terms and expressions of excluding any
equivalents of the features shown and described or portions
thereof, but it is recognized that various modifications are
possible within the scope of the invention claimed.

In addition, where features or aspects of the invention are
described in terms of Markush groups, those skilled in the art
will recognize that the invention is also thereby described in
terms of any individual member or subgroup of members of the
Markush group. For example, if X is described as selected from
the group consisting of bromine, chlorine, and iodine, claims for
X being bromine and claims for X being bromine and chlorine are
fully described.

Other embodiments are within the following claims.

Claims

1. A method for screening for one or more nucleic acid 
sequences that express one or more products that convert a source 
compound into a target compound, comprising contacting a cell 
with one or more test nucleic acid sequences, wherein said cell 
expresses one or more genes encoding one or more proteins that in 
the presence of said target compound provide a detectable signal, 
wherein said detectable signal indicates the presence of said one 
or more nucleic acid sequences. 

2. The method of claim 1, wherein said one or more nucleic
acid sequences encodes a metabolic pathway not normally present 
in said cell. 

3. The method of claim 2, wherein said one or more nucleic 
acid sequences are selected from the group consisting of 
mutagenized DNA, environmental DNA, combinatorial libraries, and 
recombinant DNA. 

4. The method of claim 3, wherein said environmental DNA 
is isolated from one or more sources selected from the group 
consisting of mud, soil, water, sewage, flood control channels, 
and sand. 

5. The method of claim 3, wherein said mutagenized DNA is 
the result of enzyme mutagenesis wherein said mutagenesis is 
selected from the group consisting of random, chemical, PCR- 
based, and directed mutagenesis. 

6. The method of claim 5, wherein said enzyme is selected
from the group consisting of lactonases, esterhydrolases, and
reductases.

7. The method of claim 1, wherein said detectable signal
is selected from a group consisting of growth, fluorescence,
luminescence, and color.

8. The method of claim 7, wherein said detectable signal
is growth.

9. The method of claim 1, wherein said target compound 
provides an element required for growth. 

10. The method of claim 9, wherein said element is selected
from the group consisting of carbon, nitrogen, sulfur, and
phosphorous.

11. The method of claim 10, wherein said element is carbon. 

12. The method of claim 9, wherein said target compound is 
selected from the group consisting of ascorbate and 2-KLG. 

13. The method of claim 12, wherein said target compound is
ascorbate.

14. The method of claim 1, wherein said source compound is
selected from the group consisting of 2-Keto-L-Gulonate,
2,5-Deoxy-Keto-Gulonate, L-Idonate, L-Gulonate, and glucose.

15. The method of claim 14, wherein said source compound is
2-Keto-L-Gulonate.

16. The method of claim 1, wherein said cell naturally 
expresses said one or more genes encoding said one or more 
proteins that in the presence of said target compound provide a 
detectable signal.




17. The method of claim 16, wherein said one or more 
proteins are one or more Yia operon-related polypeptides. 

18. The method of claim 1, wherein said cell has been
genetically manipulated to express said one or more genes
encoding one or more proteins that in the presence of said target
compound provide a detectable signal.

19. The method of claim 18, wherein said one or more 
proteins are one or more Yia operon-related polypeptides. 

20. The method of claim 18, wherein said one or more genes
encoding said one or more proteins are under the control of an
inducible promoter.

21. The method of claim 20, wherein said inducible promoter
comprises the trp-lac hybrid promoter, the lacO operator, and the
lacIq repressor gene.

22. The method of claim 1, wherein said cell grows on
ascorbate and does not grow on 2-Keto-L-Gulonate.

23. The method of claim 22, wherein said cell is a
bacteria.

24. The method of claim 23, wherein said bacteria is
Klebsiella oxytoca.

25. The method of claim 1, wherein said cell grows on
2-Keto-L-Gulonate and does not grow on 2,5-Deoxy-Keto-Gulonate.

26. An isolated, enriched, or purified nucleic acid
molecule encoding one or more Yia operon-related polypeptides
selected from the group consisting of YiaJ, YiaK, YiaL, ORF1,
YiaX2, LyxK, YiaQ, YiaR, and YiaS.

27. The nucleic acid molecule of claim 26, wherein said
nucleic acid molecule comprises a nucleotide sequence that:

(a) encodes a polypeptide having the full length amino acid
sequence set forth in SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12,
SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID
NO:17, or SEQ ID NO:18;

(b) is the complement of the nucleotide sequence of (a);
and

(c) hybridizes under highly stringent conditions to the
nucleotide molecule of (a) and encodes a naturally occurring
polypeptide.

28. The nucleic acid molecule of claim 26, further 
comprising a vector or promoter effective to initiate 
transcription in a host cell. 

29. The nucleic acid molecule of claim 26, wherein said
nucleic acid molecule is isolated, enriched, or purified from a
bacteria.

30. The nucleic acid molecule of claim 29, wherein said 
bacteria is Klebsiella oxytoca. 

31. A nucleic acid probe for the detection of nucleic acid
encoding one or more Yia operon-related polypeptides, selected
from the group consisting of YiaJ, YiaK, YiaL, ORF1, YiaX2, LyxK,
YiaQ, YiaR, and YiaS, in a sample.

32. The probe of claim 31, wherein said polypeptide is a
fragment of the protein encoded by the full length amino acid
sequence set forth in SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12,
SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID
NO:17, or SEQ ID NO:18.

33. A recombinant cell comprising a nucleic acid molecule
encoding one or more Yia operon-related polypeptides selected
from the group consisting of YiaJ, YiaK, YiaL, ORF1, YiaX2, LyxK,
YiaQ, YiaR, and YiaS.

34. The cell of claim 33, wherein said polypeptide is a
fragment of the protein encoded by the amino acid sequence set
forth in SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13,
SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, or SEQ ID
NO:18.

35. An isolated, enriched, or purified Yia operon-related
polypeptide selected from the group consisting of YiaJ, YiaK,
YiaL, ORF1, YiaX2, LyxK, YiaQ, YiaR, and YiaS.

36. The polypeptide of claim 35, wherein said polypeptide
is a fragment of the protein encoded by the full length amino
acid sequence set forth in SEQ ID NO:10, SEQ ID NO:11, SEQ ID
NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16,
SEQ ID NO:17, or SEQ ID NO:18.

37. The polypeptide of claim 35, wherein said polypeptide
is isolated, enriched, or purified from bacteria.

38. The nucleic acid molecule of claim 37, wherein said
bacteria is Klebsiella oxytoca.



39. An isolated, enriched, or purified nucleic acid
molecule, wherein said nucleic acid molecule comprises the
nucleotide sequence set forth in SEQ ID NO:19.

40. The nucleic acid molecule of claim 39, wherein said
nucleic acid molecule comprises:




(a) one or more nucleotide sequences that are set forth in
SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5,
SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, or SEQ ID NO:9;

(b) the complement of the nucleotide sequence of (a);

(c) nucleic acid that hybridizes under stringent conditions
to the nucleotide molecule of (a);

(d) the full length sequence of SEQ ID NO:19, except that
it lacks one or more of the sequences set forth in SEQ ID NO:1,
SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6,
SEQ ID NO:7, SEQ ID NO:8, or SEQ ID NO:9; and

(e) the complement of the nucleotide sequence of (d).

41. The nucleic acid molecule of either of claims 39 or 40,
further comprising a vector or promoter effective to initiate
transcription in a host cell.

42. The nucleic acid molecule of claim 41, wherein said
vector or promoter comprises the trp-lac hybrid promoter, the
lacO operator, and the lacIq repressor gene.

43. The nucleic acid molecule of claim 39, wherein said
nucleic acid molecule is isolated, enriched, or purified from a
bacteria.

44. The nucleic acid molecule of claim 43, wherein said
bacteria is Klebsiella oxytoca.

45. A recombinant cell, comprising the nucleic acid
molecule of claim 42.

46. A recombinant cell useful for screening for one or more
nucleic acid sequences that express one or more products that
convert a source compound into a target compound, wherein said
cell expresses one or more genes comprising an inducible
promoter, and wherein said one or more genes encodes one or more
proteins that in the presence of said target compound and an
inducer provide a detectable signal, wherein said detectable
signal indicates the presence of said one or more nucleic acid
sequences.

47. The recombinant cell of claim 46, wherein said one or 
more nucleic acid sequences encodes a metabolic pathway not 
normally present in said cell. 

48. The recombinant cell of claim 47, wherein said one or 
more nucleic acid sequences are selected from the group 
consisting of mutagenized DNA, environmental DNA, combinatorial 
libraries, and recombinant DNA. 



49. The recombinant cell of claim 48, wherein said
environmental DNA is isolated from one or more sources selected
from the group consisting of mud, soil, water, sewage, flood
control channels, and sand.

50. The recombinant cell of claim 48, wherein said 
mutagenized DNA is the result of enzyme mutagenesis wherein said 
mutagenesis is selected from the group consisting of random, 
chemical, PCR-based, and directed mutagenesis. 



51. The method of claim 50, wherein said enzyme is selected
from the group consisting of lactonases, esterhydrolases, and
reductases.

52. The recombinant cell of claim 46, wherein said
detectable signal is selected from a group consisting of growth,
fluorescence, luminescence, and color.

53. The recombinant cell of claim 46, wherein said
detectable signal is growth.

54. The recombinant cell of claim 53, wherein said cell
requires the presence of said target compound and said inducer
for growth.

55. The recombinant cell of claim 54, wherein said target
compound is selected from the group consisting of ascorbate and
2-Keto-L-Gulonate.

56. The recombinant cell of claim 46, wherein said one or
more genes are under the control of said inducible promoter.

57. The recombinant cell of claim 56, wherein said
inducible promoter comprises the trp-lac hybrid promoter, the
lacO operator, and the lacIq repressor gene.

58. The recombinant cell of claim 56, wherein said one or
more proteins comprise one or more Yia operon-related
polypeptides.

59. The recombinant cell of claim 58, wherein said cell
naturally expresses said one or more genes.

60. The recombinant cell of claim 58, wherein said cell has
been genetically manipulated to express said one or more genes.

61. The recombinant cell of claim 58, wherein said cell is
a bacteria.

62. The recombinant cell of claim 61, wherein said bacteria
is Klebsiella oxytoca.

63. A method for identifying a substance that modulates the
conversion of a source compound to a target compound, comprising:

contacting a cell with nucleic acid, wherein said nucleic
acid expresses a product that converts a source compound into a
target compound, and wherein said cell expresses one or more
proteins which in the presence of said target compound provide a
detectable signal;

contacting said cell with a test substance; and

monitoring said detectable signal, wherein said detectable
signal indicates the presence of said substance.

64. The method of claim 63, wherein the substance is 
selected from the group consisting of antibodies, small organic 
molecules, peptidomimetics, and natural products. 

65. The method of claim 64, wherein said detectable signal
is selected from a group consisting of growth, fluorescence,
luminescence, and color.

66. The method of claim 65, wherein said detectable signal
is growth, and wherein said target compound is metabolizable to
an element selected from the group consisting of carbon,
nitrogen, sulfur, and phosphorous.

67. The method of claim 66, wherein said element is carbon. 

68. The method of claim 63, wherein said source compound is
selected from the group consisting of 2-Keto-L-Gulonate,
2,5-Deoxy-Keto-Gulonate, L-Idonate, L-Gulonate, and glucose.

69. The method of claim 63, wherein said one or more
proteins are one or more Yia operon-related polypeptides.

70. The method of claim 69, wherein said Yia operon further
comprises a vector or promoter effective to initiate
transcription in a host cell.

71. The method of claim 70, wherein said vector or promoter
comprises the trp-lac hybrid promoter, the lacO operator, and the
lacIq repressor gene.

72. A method for detecting the presence, absence, or amount
of a compound in a sample comprising:

contacting said sample with a cell, wherein said cell
expresses one or more genes encoding one or more proteins that in
the presence of said compound provide a detectable signal that
indicates the presence, absence, or amount of said compound.

73. The method of claim 72, wherein said compound is
ascorbate.

74. The method of claim 72, wherein said detectable signal
is selected from a group consisting of growth, fluorescence,
luminescence, and color.

75. The method of claim 72, wherein said one or more genes
comprises yiaJ.

76. The method of claim 75, wherein said one or more genes
further comprises a promoter transcriptionally linked to a
reporter gene.

77. The method of claim 76, wherein YiaJ is naturally
expressed in said cell.

78. The method of claim 76, wherein said cell has been
genetically manipulated to express said yiaJ.

79. The method of claim 76, wherein the expression of said
reporter gene is regulated by the binding of YiaJ to said
promoter.

80. The method of claim 72, wherein said cell is a
bacteria.

81. The method of claim 80, wherein said bacteria is
Klebsiella oxytoca.

82. An isolated, purified, or enriched nucleic acid
molecule encoding YiaJ and a reporter gene.

83. The nucleic acid molecule of claim 82, further
comprising a promoter transcriptionally linked to said reporter
gene.






84. The nucleic acid molecule of claim 83, wherein the 
expression of said reporter gene is regulated by the binding of 
YiaJ to said promoter. 

85. A recombinant cell for detecting the presence, absence,
or amount of a compound in a sample comprising the nucleic acid
molecule of either of claims 82 or 83.

86. A recombinant cell for detecting the presence, absence,
or amount of a compound in a sample, wherein said cell expresses
one or more genes encoding one or more proteins that in the
presence of said compound provide a detectable signal, wherein
said signal indicates the presence, absence, or amount of said
compound.

87. The recombinant cell of claim 86, wherein said
detectable signal is selected from a group consisting of growth,
fluorescence, luminescence, and color.

88. The recombinant cell of claim 86, wherein said one or
more genes comprises yiaJ.

89. The recombinant cell of claim 88, wherein said one or
more genes further comprises a promoter transcriptionally linked
to a reporter gene.

90. The recombinant cell of claim 89, wherein YiaJ is
naturally expressed in said cell.

91. The recombinant cell of claim 89, wherein said cell has
been genetically manipulated to express said yiaJ.

92. The recombinant cell of claim 89, wherein the
expression of said reporter gene is regulated by the binding of
YiaJ to said promoter.

93. The recombinant cell of claim 86, wherein said cell is
a bacteria.

94. The recombinant cell of claim 93, wherein said bacteria
is Klebsiella oxytoca.

95. A method of selection for a nucleic acid sequence
encoding a metabolic pathway from a source compound to a target
compound comprising:

(1) identifying an organism that metabolizes a target
compound to provide an essential element;

(2) identifying one or more genes responsible for the
metabolism of said target compound to said essential element;

(3) expressing said one or more genes under the control of
an inducible promoter, whereby said target compound is
metabolized in the presence of an inducer and not in the absence
of said inducer;

(4) expressing nucleic acid sequences potentially encoding
said metabolic pathway in said recipient organism; and

(5) selecting said recipient organism for growth on said
source compound in the absence of said target compound and in the
presence of said inducer, wherein growth on said source compound
in the absence of said target compound and in the presence of
said inducer indicates the presence of said nucleic acid
sequence.

96. The method of claim 95, wherein said essential element 
is selected from the group consisting of carbon, phosphorous, 
nitrogen, and sulfur. 

97. The method of claim 96, wherein said essential element 
is carbon. 

98. The method of claim 95, further comprising the transfer 
of said one or more genes to a highly genetically manipulatable 
recipient organism, such that said recipient organism metabolizes 
said target compound to provide an essential element. 

[Figure: nucleotide sequence of the Yia operon region, with the deduced amino acid translation shown beneath each line; the start of yiaQ is marked. The nucleotide sequences are given in the Sequence Listing.]

SEQUENCE LISTING

<110> Hoch, James 

Dartois, Veronique 

<120> METABOLIC SELECTION METHODS 

<130> WESLEY B. AMES: Microgenomics 

<140> 
<141>

<160> 33 

<170> PatentIn Ver. 2.0 

<210> 1 
<211> 816 
<212> DNA 

<213> yia j
<400> 1 

atgggcacaa aagaaagcga gaacacgcaa gataaagaga ggcctgccgg aagtcagagc 60 
ctttttcgtg ggttgatgct aattgagatc ctgagtaatt atccaaatgg ctgtcccgtg 120 
gcgcatctgt cggaactggc gggactgaac aaaagtaccg ttcatcgctt attacagggg 180 
ctgcagtcct gcgggtacgt gacgcctgcc ccggcggcgg ggagctatgc gctgacgaca 240 
aaatttatcc gcgttggcca aaaggcgttg tcgtcgctga atattatcca cgtcgcggcg 300 
ccgcatcttg aggcgcttaa cctggccacc ggcgagacgg tgaacttctc cagccgtgaa 360 
gatgaccacg cgatcctgat ttataagctg gagccgacca ccggtatgct gcgtacgcgc 420 
gcctatattg gccagcacat gcgctgtact gctcggcaat gggcaaagat ttdtatggcg 480 
tttggccatc ctgactacgt tgagagctac tggaattcac accaggagat tatccagccg 540 
ctgacccgta ataccattac cggcttgcct gcgatgcatg atgaactggc gcagatccgc 600 
gagcgaaata tggcgatgga cagggaagag aacgagctgg gcgtgtcgtg cctggctgtc 660 
cccgtttttg atatccatgg gcgcgtgcct tatgccattt ctatctctct atcaacatcg 720 
cgcctcaagc aggtgggaga gaaaaattta ctcaagccgc tacgcgatac ggcagaggcg 780 
atttctcgcg aactgggctt ttccgtgcgg gaaggt 816 

<210> 2 
<211> 996 
<212> DNA 
<213> yia k 

<400> 2 

atgaaagtca cgtttgagca gttaaaagag gcattcaatc gggtactgct ggacgcgtgc 60
gtcgcccggg aaaccgccga tgcctgcgca gaaatgtttg cccgcaccac cgaatccggc 120
gtctattctc acggcgtgaa ccgctttcct cgcttcatcc agcagttgga taacggcgac 180
attatccctg aggctcaacc gcagcgggtg accacgctcg gcgccatcga acagtgggat 240
gctcagcgtt ccatcggcaa cctgacggcg aaaaagatga tggatcgggc cattgagctg 300
gcctccgatc acggtatcgg cctggtcgcc ttacgtaatg ctaaccactg gatgcgcggc 360
ggcagctacg gctggcaggc ggcggaaaaa ggctacatcg gtatctgctg gaccaactcc 420
atcgccgtta tggcgccatg gggcgctaaa gagtgccgta tcggtaccaa cccgctgatc 480
gtcgccattc cgtcgacgcc gatcaccatg gtggatatgt cgatgtcgat gttctcctac 540
ggcatgctgg aggttaaccg ccttgccggc cgcgaactgc ccgtggacgg cggattcgac 600
gatgacggtc gtttgaccaa agagccgggg acgatcgaga aaaatcgccg cattttaccc 660
atgggctact ggaaaggttc cggcctgtcg atcgtgctgg atatgattgc caccctcctc 720
tccaacggat cgtcggttgc cgaagtgacc caggaaaaca gcgatgaata tggcgtttcg 780
cagatcttca tcgctattga agtggataag ctgatcgacg gcgcaacccg cgacgccaag 840
ctgcaacgga ttatggattt catcaccacc gccgagcgcg ccgatgaaaa tgtggcggtc 900
cgtcttcctg gccatgaatt tacccgtctg ctggatgaaa accgccgcaa cggcattacc 960
gtcgatgaca gcgtatgggc caaaattcag gcgctg 996



<210> 3 
<211> 462 
<212> DNA 
<213> yia l



<400> 3 

atgatttttg gtcatattgc tcaacctaat ccgtgtcgtc tgcccgcggc cattgagcgg 60
gcgcttgatt tcctgcgcac gacggatttc cacgcgctgg cacccggcgt cgtggaaatc 120
gacggccaaa acatcttcgc gcaggttatc gacttaacca ctcgcgatgc cgctgaaaat 180
cgtccggagg tccaccgtcg ctatctggat atccagtttc tggcatcggg cgaagaaaaa 240
atcggtatcg ccattgatac cggcaataat caaatcagcg aatctttatt agaacagcgc 300
gatattattt tttatcacga cagcgaacat gaatcgttct ttgaaatgac gccaggcaac 360
tatgcgatat ttttcccgca agatgttcat cgtcctggat gtaataaaac tgtagccacg 420
ccgatccgca aaatagtcgt taaagtcgct atttcagttt ta 462



<210> 4 
<211> 945 
<212> DNA 
<213> orf1

<400> 4

atgaattcga ataataccgg ttacattatc ggtgcgtacc cctgtgcccc ctgtgcaccc 60
tcatttcacc aaaagagtga agaggaagag atggaattct ggcggcagct ctccgacacc 120
ccggatattc gcgggctgga gcaaccctgc ctaccctgcc ttgaacatct tcatccgctc 180
ggcgacgagt ggttattgcg ccataccccg ggacactggc agattgtcgt taccgccatc 240
atggaaacca tgcgccgccg cggtgaaaac ggcggctttg ggctggcgtc cagcgacgaa 300
acgcagcgca aagcctgcgt ggagtactat cgccacctgc agcagaagat cgctaaaatc 360
aatggcaata ccgccggaaa ggtcattgcc cttgagcttc acgccgcccc gctggcgggc 420
aatgccaacg tggctcaggc taccgacgcc tttgcccgtt cattaaaaga aattacccgc 480
tgggactggt cctgcgagct ggtgctggag cactgcgacg cgatgaccgg cagcgcgccg 540
cgcaaaggat ttttgccgtt agaaaacgtg ctggaagcca ttgccgatta tgacgttggc 600
atttgtatta actgggcgcg ttcggccatt gaagggcgga ataccgtgct accgctcacc 660
catacgcagc aggtaaaacg ggcaggaaag ctcggcgcgc tgatgttttc tggcacgacg 720
cagaccggcg agtacggcga atggcaggat ttacacgcgc cgttcgcgcc tttctgcccg 780
cagagcctga tgaccaccga acacgctcgt gaattatttg cctgcgcagg aaccgccccc 840
ctgcaatttt caggcattaa attactggaa attaatgcca gcgcaaacgt tgatcatcgc 900
atcgcgatat tacgcgacgg catctccgcg ctaaaacaag cacaa 945

<210> 5 
<211> 1317 
<212> DNA 
<213> yia x2 



<400> 5

atgaatataa cctctaactc tacaaccaaa gatataccgc gccagcgctg gttaagaatc 60
attccgccta tactgatcac ttgtattatt tcttatatgg accgggtcaa tattgccttt 120
gcgatgcccg gaggtatgga tgccgactta ggtatttccg ccaccatggc ggggctggcg 180
ggcggtattt tctttatcgg ttatctattt ttacaggttc ccggcgggaa aattgccgtt 240
cacggtagcg gtaagaaatt tatcggctgg tcgctggtcg cctgggcggt catctccgtg 300
ctgacggggt taattaccaa tcagtaccag ctgctggccc tgcgcttctt actgggcgtg 360
gcggaaggcg gtatgctgcc ggtcgttctc acgatgatca gtaactggtt ccccgacgct 420
gaacgcggtc gcgccaacgc gattgtcatt atgtttgtgc cgattgccgg gattatcacc 480
gccccactct caggctggat tatcacggtt ctcgactggc gctggctgtt tattatcgaa 540
ggtttgctct cgctggttgt tctggttctg tgggcataca ccatctatga ccgtccgcag 600
gaagcgcgct ggatttccga agcagagaag cgctatctgg tcgagacgct ggccgcggag 660
caaaaagcca ttgccggcac cgaggtgaaa aacgcctctc tgagcgccgt tctctccgac 720
aaaaccatgt ggcagcttat cgccctgaac ttcttctacc agaccggcat ttacggctac 780
accctgtggc tacccaccat tctgaaagaa ttgacccata gcagcatggg gcaggtcggc 840
atgcttgcca ttctgccgta cgtcggcgcc attgctggga tgttcctgtt ttcctccctt 900
tcagaccgaa ccggtaaacg caagctgttc gtctgcctgc cgctgattgg cttcgctctg 960
tgcatgttcc tgtcggtggc gctgaaaaac caaatttggc tctcctatgc cgcgctggtc 1020
ggctgcggat tcttcctgca atcggcggct ggcgtgttct ggaccatccc ggcacgtctg 1080
ttcagcgcgg aaatggcggg cggcgcgcgc ggggttatca acgcgcttgg caacctcggc 1140
ggattttgtg gcccttatgc ggtcggggtg ctgatcacgt tgtacagcaa agacgctggc 1200
gtctattgcc tggcgatctc cctggcgctg gccgcgctga tggcgctgct gctgccggcg 1260
aaatgcgatg ccggtgctgc gccggtaaag acgataaatc cacataaacg cactgcg 1317

<210> 6
<211> 1503
<212> DNA
<213> lyxk

<400> 6

atgagcaaga aacaggcctt ctggctgggt attgattgcg gcggcaccta tctgaaagcc 60
ggtttatatg acgccgaagg tcatgaacat ggcattgtgc ggcaagcgct acggacgatg 120
tcgcccctgc cgggttacgc cgaacgcgac atgcgccagc tctggcaaca ctgcgcggcg 180
accattgccg ggctattaca gcaggcaggt gtatccggcg aacagattaa aggcgtgggc 240
atctccgctc agggtcaagg gctctttctc ctcgataagc aggatcggcc gctgggtaac 300
gccatcctct cctccgatcg tcgggcgctg aaaatcgttc agcgctggca gcgggaccgt 360
attcccgaac ggctctatcc cgttacccgc cagacgctgt ggaccggaca tccggcttct 420
ttgctgcgct gggtaaaaga gaatgaaccc cagcgctacg cgcaaattgg ctgcgtgatg 480
atggggcatg actatctgcg ctggtgctta accggcgcga agggctgcga ggagagcaac 540
atctccgagt ccaacctcta caacatggcc atgggccagt acgacccgcg cctgaccgag 600
tggctgggca tcggtgaaat cgatagcgcg ctgccccccg ttgtagggtc agccgaaatt 660
tgcggggaga tcaccgctca ggcagccgct ttaaccggtc tggcggcggg tactcccgtc 720
gttggcggcc tgtttgacgt ggtctccacc gccctttgcg ccgggattga ggatgagtcg 780
accctcaatg cggtgatggg gacctgggcc gtcactagcg gtatcgctca cggcctgcgc 840
gaccatgagg cccaccctta cgtctatggc cgctacgtca atgacggcca gtatatcgtt 900
cacgaagcca gcccgacctc atccggcaac ctcgaatggt ttaccgccca gtggggcgat 960
ctctcgtttg atgagatcaa tcaggccgtc gccagcctgc cgaaagccgg gagcgagctg 1020
ttttttctgc cgtttctgta tggcagcaac gccgggctgg agatgacctg cggcttttac 1080
ggcatgcagg cgctgcatac ccgcgcgcac ctgctgcagg cggtttatga aggcgtggta 1140
tttagccata tgacccacct cagccgtatg cgcgaacgct ttacaaacgt tcaggccctg 1200
cgcgtcaccg gcggcccggc gcactccgac gtctggatgc agatgctggc ggacgtaagc 1260
ggcttacgca ttgaactccc gaaggtggaa gagaccggct gttttggcgc ggccctcgcc 1320
gctcgtgtcg gtaccggcgt ataccgcagc tttagcgaag cccggcgcgc ccggcagcac 1380
ccggtgcgca cgctgctgcc cgatatgacc gcccacgcgc gctatcagcg caaataccgc 1440
cactacctgc atttgattga agcactacag ggctatcacg cccgtattaa ggagcacgca 1500
tta 1503

<210> 7
<211> 660
<212> DNA
<213> yia q

<400> 7 

atgagccgac cattactgca gctggcgctc gaccatacca gccttcaggc tgcgcagcgc 60 
gatgtcgccc tgctacagga tcacgttgat attgtggagg cgggaaccat cctctgctta 120 
accgaagggc ttagcgcggt taaagccctg cgcgcccagt gtccggggaa gatcatcgtc 180 
gccgactgga aagtcgccga cgccggtgaa accctggcgc agcaggcctt tggcgctggc 240 
gccaactgga tgaccatcat ttgcgccgca ccgctcgcca cggtcgagaa aggccacgcc 300 
gtggcccagg cctgcggcgg tgaaattcag atggagctgt tcggcaactg gacgctggat 360 
gacgcccgcg cctggtaccg taccggcgtc catcaggcga tttaccatcg cggacgcgat 420 
gcccaggcca gcgggcagca gtggggggag gcggatctgg cgcgcatgaa agcgctgtcc 480 
gatattggcc ttgagctatc gattaccggc ggcattaccc cagccgatct accgctgttc 540 
aaagatatca acgtcaaagc ctttattgcc gggcgcgcgc tggcaggcgc cgcccatccg 600 
gcgcgggttg ccgccgaatt ccacgcgcaa atcgacgcta tctggggaga acagcatgcg 660 



<210> 8 
<211> 858 
<212> DNA 
<213> yia r 



<400> 8

atgcgtaacc acccgttagg tatttatgaa aaagcgctgg cgaaggatct cagctggcct 60
gagcggctgg tactggccaa aagctgcggt tttgattttg tcgaaatgtc ggtggacgag 120
accgatgaac gcctttcgcg cctggagtgg accccggccc agcgcgcatc gctggtgagc 180
gcgatgctgg aaaccgcggt cgccattccc tcgatgtgct tgtccgccca tcgccgtttc 240
ccctttggca gccgcgatga agcgctacgc gatcgggcgc gagagattat gaccaaagcc 300
atccgcctgg cgcgcgatct ggggatccgc accatccagc tggcgggtta cgacgtctat 360
tacgaagagc atgatgaagg cacccggcag cgttttgccg aagggctggc ctgggcggta 420
gaacaggccg ccgccgcgca ggtaatgctg gcggtggaga tcatggacac cgcctttatg 480
aactccatca gcaaatggaa aaagtgggac gagatgcttt cgtcaccgtg gtttaccgtc 540
tacccggacg tcggcaacct cagcgcctgg ggaaacgacg tcaccgccga gctgaagctg 600
ggcatcgatc gtatcgccgc catccacctg aaagatacgc tgcccgtgac cgacgatagc 660
cctggccagt tccgcgacgt gccgttcggc gaaggatgcg tcgattttgt cggcattttt 720
aagacgctgc gcgagctgaa ctaccgcggt tcatttttga ttgagatgtg gacggagaaa 780
gccagcgagc cggtgctgga gattatccag gcccggcgct ggatcgaatc acggatgcag 840
gaagggggat tcacatgt 858

<210> 9
<211> 714
<212> DNA
<213> yia s

<400> 9

atgttagaac aactgaaagc cgaggtactg gcggcaaacc tggccctccc cgcacacggc 60
ctggtcacct ttacctgggg caacgtcagc gcggtcgatg aaacgcgcaa gctgatggtc 120
attaagcctt ccggcgtcga atatgaggtg atgaccgccg acgatatggt ggtcgtagag 180
atggccagcg gtaaagtcgt tgaaggcggt aaaaaaccct cttcagatac gccaacgcat 240
ctggcgcttt atcgccgcta tccgcagatc ggcgggatcg tgcataccca ctcccgccac 300
gcgacgatct ggtcgcaggc cgggctcgat ctccccgcct ggggcaccac ccacgccgac 360
tacttctatg gcgcgatccc ctgtacccga cggatgaccg ttgaggagat taacggcgag 420
tatgagtatc agaccggcga ggtgattatc aaaacctttg aacagcgcgg cctggatccg 480
gcgcaaatcc cggcggtatt ggtccattca cacggcccct ttgcctgggg taaagacgcc 540
gccgacgccg tacataacgc cgtggtgctg gaggagtgcg cctacatggg cctcttctcg 600
cgccagtggc cacagctgcc ggatatgcag tctgaactgc tcgataaaca ctatctgcgt 660
aaacacggcg cgaacgctat tacgggcaaa actagtcccg cggaactccc cgga 714

<210> 10
<211> 272
<212> PRT
<213> YiaJ-Ko

<400> 10
Met Gly Thr Lys Glu Ser Glu Asn Thr Gln Asp Lys Glu Arg Pro Ala
1 5 10 15

Gly Ser Gln Ser Leu Phe Arg Gly Leu Met Leu Ile Glu Ile Leu Ser
20 25 30

Asn Tyr Pro Asn Gly Cys Pro Val Ala His Leu Ser Glu Leu Ala Gly
35 40 45

Leu Asn Lys Ser Thr Val His Arg Leu Leu Gln Gly Leu Gln Ser Cys
50 55 60

Gly Tyr Val Thr Pro Ala Pro Ala Ala Gly Ser Tyr Ala Leu Thr Thr
65 70 75 80

Lys Phe Ile Arg Val Gly Gln Lys Ala Leu Ser Ser Leu Asn Ile Ile
85 90 95

His Val Ala Ala Pro His Leu Glu Ala Leu Asn Leu Ala Thr Gly Glu
100 105 110

Thr Val Asn Phe Ser Ser Arg Glu Asp Asp His Ala Ile Leu Ile Tyr
115 120 125

Lys Leu Glu Pro Thr Thr Gly Met Leu Arg Thr Arg Ala Tyr Ile Gly
130 135 140

Gln His Met Arg Cys Thr Ala Arg Gln Trp Ala Lys Ile Tyr Met Ala
145 150 155 160

Phe Gly His Pro Asp Tyr Val Glu Ser Tyr Trp Asn Ser His Gln Glu
165 170 175

Ile Ile Gln Pro Leu Thr Arg Asn Thr Ile Thr Gly Leu Pro Ala Met
180 185 190

His Asp Glu Leu Ala Gln Ile Arg Glu Arg Asn Met Ala Met Asp Arg
195 200 205

Glu Glu Asn Glu Leu Gly Val Ser Cys Leu Ala Val Pro Val Phe Asp
210 215 220

Ile His Gly Arg Val Pro Tyr Ala Ile Ser Ile Ser Leu Ser Thr Ser
225 230 235 240

Arg Leu Lys Gln Val Gly Glu Lys Asn Leu Leu Lys Pro Leu Arg Asp
245 250 255

Thr Ala Glu Ala Ile Ser Arg Glu Leu Gly Phe Ser Val Arg Glu Gly
260 265 270



<210> 11 
<211> 332 
<212> PRT 
<213> YiaK-Ko 

<400> 11 

Met Lys Val Thr Phe Glu Gln Leu Lys Glu Ala Phe Asn Arg Val Leu
1 5 10 15

Leu Asp Ala Cys Val Ala Arg Glu Thr Ala Asp Ala Cys Ala Glu Met
20 25 30

Phe Ala Arg Thr Thr Glu Ser Gly Val Tyr Ser His Gly Val Asn Arg
35 40 45

Phe Pro Arg Phe Ile Gln Gln Leu Asp Asn Gly Asp Ile Ile Pro Glu
50 55 60

Ala Gln Pro Gln Arg Val Thr Thr Leu Gly Ala Ile Glu Gln Trp Asp
65 70 75 80

Ala Gln Arg Ser Ile Gly Asn Leu Thr Ala Lys Lys Met Met Asp Arg
85 90 95

Ala Ile Glu Leu Ala Ser Asp His Gly Ile Gly Leu Val Ala Leu Arg
100 105 110

Asn Ala Asn His Trp Met Arg Gly Gly Ser Tyr Gly Trp Gln Ala Ala
115 120 125

Glu Lys Gly Tyr Ile Gly Ile Cys Trp Thr Asn Ser Ile Ala Val Met
130 135 140

Ala Pro Trp Gly Ala Lys Glu Cys Arg Ile Gly Thr Asn Pro Leu Ile
145 150 155 160

Val Ala Ile Pro Ser Thr Pro Ile Thr Met Val Asp Met Ser Met Ser
165 170 175

Met Phe Ser Tyr Gly Met Leu Glu Val Asn Arg Leu Ala Gly Arg Glu
180 185 190

Leu Pro Val Asp Gly Gly Phe Asp Asp Asp Gly Arg Leu Thr Lys Glu
195 200 205

Pro Gly Thr Ile Glu Lys Asn Arg Arg Ile Leu Pro Met Gly Tyr Trp
210 215 220

Lys Gly Ser Gly Leu Ser Ile Val Leu Asp Met Ile Ala Thr Leu Leu
225 230 235 240

Ser Asn Gly Ser Ser Val Ala Glu Val Thr Gln Glu Asn Ser Asp Glu
245 250 255

Tyr Gly Val Ser Gln Ile Phe Ile Ala Ile Glu Val Asp Lys Leu Ile
260 265 270

Asp Gly Ala Thr Arg Asp Ala Lys Leu Gln Arg Ile Met Asp Phe Ile
275 280 285

Thr Thr Ala Glu Arg Ala Asp Glu Asn Val Ala Val Arg Leu Pro Gly
290 295 300

His Glu Phe Thr Arg Leu Leu Asp Glu Asn Arg Arg Asn Gly Ile Thr
305 310 315 320

Val Asp Asp Ser Val Trp Ala Lys Ile Gln Ala Leu
325 330



<210> 12
<211> 154
<212> PRT
<213> YiaL-Ko

<400> 12

Met Ile Phe Gly His Ile Ala Gln Pro Asn Pro Cys Arg Leu Pro Ala
1 5 10 15

Ala Ile Glu Arg Ala Leu Asp Phe Leu Arg Thr Thr Asp Phe His Ala
20 25 30

Leu Ala Pro Gly Val Val Glu Ile Asp Gly Gln Asn Ile Phe Ala Gln
35 40 45

Val Ile Asp Leu Thr Thr Arg Asp Ala Ala Glu Asn Arg Pro Glu Val
50 55 60

His Arg Arg Tyr Leu Asp Ile Gln Phe Leu Ala Ser Gly Glu Glu Lys
65 70 75 80

Ile Gly Ile Ala Ile Asp Thr Gly Asn Asn Gln Ile Ser Glu Ser Leu
85 90 95

Leu Glu Gln Arg Asp Ile Ile Phe Tyr His Asp Ser Glu His Glu Ser
100 105 110

Phe Phe Glu Met Thr Pro Gly Asn Tyr Ala Ile Phe Phe Pro Gln Asp
115 120 125

Val His Arg Pro Gly Cys Asn Lys Thr Val Ala Thr Pro Ile Arg Lys
130 135 140

Ile Val Val Lys Val Ala Ile Ser Val Leu
145 150



<210> 13
<211> 315
<212> PRT
<213> ORF1

<400> 13

Met Asn Ser Asn Asn Thr Gly Tyr Ile Ile Gly Ala Tyr Pro Cys Ala
1 5 10 15

Pro Cys Ala Pro Ser Phe His Gln Lys Ser Glu Glu Glu Glu Met Glu
20 25 30

Phe Trp Arg Gln Leu Ser Asp Thr Pro Asp Ile Arg Gly Leu Glu Gln
35 40 45

Pro Cys Leu Pro Cys Leu Glu His Leu His Pro Leu Gly Asp Glu Trp
50 55 60

Leu Leu Arg His Thr Pro Gly His Trp Gln Ile Val Val Thr Ala Ile
65 70 75 80

Met Glu Thr Met Arg Arg Arg Gly Glu Asn Gly Gly Phe Gly Leu Ala
85 90 95

Ser Ser Asp Glu Thr Gln Arg Lys Ala Cys Val Glu Tyr Tyr Arg His
100 105 110

Leu Gln Gln Lys Ile Ala Lys Ile Asn Gly Asn Thr Ala Gly Lys Val
115 120 125

Ile Ala Leu Glu Leu His Ala Ala Pro Leu Ala Gly Asn Ala Asn Val
130 135 140

Ala Gln Ala Thr Asp Ala Phe Ala Arg Ser Leu Lys Glu Ile Thr Arg
145 150 155 160

Trp Asp Trp Ser Cys Glu Leu Val Leu Glu His Cys Asp Ala Met Thr
165 170 175

Gly Ser Ala Pro Arg Lys Gly Phe Leu Pro Leu Glu Asn Val Leu Glu
180 185 190

Ala Ile Ala Asp Tyr Asp Val Gly Ile Cys Ile Asn Trp Ala Arg Ser
195 200 205

Ala Ile Glu Gly Arg Asn Thr Val Leu Pro Leu Thr His Thr Gln Gln
210 215 220

Val Lys Arg Ala Gly Lys Leu Gly Ala Leu Met Phe Ser Gly Thr Thr
225 230 235 240

Gln Thr Gly Glu Tyr Gly Glu Trp Gln Asp Leu His Ala Pro Phe Ala
245 250 255

Pro Phe Cys Pro Gln Ser Leu Met Thr Thr Glu His Ala Arg Glu Leu
260 265 270

Phe Ala Cys Ala Gly Thr Ala Pro Leu Gln Phe Ser Gly Ile Lys Leu
275 280 285

Leu Glu Ile Asn Ala Ser Ala Asn Val Asp His Arg Ile Ala Ile Leu
290 295 300

Arg Asp Gly Ile Ser Ala Leu Lys Gln Ala Gln
305 310 315



<210> 14 
<211> 439 
<212> PRT 
<213> YiaX2 



<400> 14 

Met Asn Ile Thr Ser Asn Ser Thr Thr Lys Asp Ile Pro Arg Gln Arg
1 5 10 15

Trp Leu Arg Ile Ile Pro Pro Ile Leu Ile Thr Cys Ile Ile Ser Tyr
20 25 30

Met Asp Arg Val Asn Ile Ala Phe Ala Met Pro Gly Gly Met Asp Ala
35 40 45

Asp Leu Gly Ile Ser Ala Thr Met Ala Gly Leu Ala Gly Gly Ile Phe
50 55 60

Phe Ile Gly Tyr Leu Phe Leu Gln Val Pro Gly Gly Lys Ile Ala Val
65 70 75 80

His Gly Ser Gly Lys Lys Phe Ile Gly Trp Ser Leu Val Ala Trp Ala
85 90 95

Val Ile Ser Val Leu Thr Gly Leu Ile Thr Asn Gln Tyr Gln Leu Leu
100 105 110

Ala Leu Arg Phe Leu Leu Gly Val Ala Glu Gly Gly Met Leu Pro Val
115 120 125

Val Leu Thr Met Ile Ser Asn Trp Phe Pro Asp Ala Glu Arg Gly Arg
130 135 140

Ala Asn Ala Ile Val Ile Met Phe Val Pro Ile Ala Gly Ile Ile Thr
145 150 155 160

Ala Pro Leu Ser Gly Trp Ile Ile Thr Val Leu Asp Trp Arg Trp Leu
165 170 175

Phe Ile Ile Glu Gly Leu Leu Ser Leu Val Val Leu Val Leu Trp Ala
180 185 190

Tyr Thr Ile Tyr Asp Arg Pro Gln Glu Ala Arg Trp Ile Ser Glu Ala
195 200 205

Glu Lys Arg Tyr Leu Val Glu Thr Leu Ala Ala Glu Gln Lys Ala Ile
210 215 220

Ala Gly Thr Glu Val Lys Asn Ala Ser Leu Ser Ala Val Leu Ser Asp
225 230 235 240

Lys Thr Met Trp Gln Leu Ile Ala Leu Asn Phe Phe Tyr Gln Thr Gly
245 250 255

Ile Tyr Gly Tyr Thr Leu Trp Leu Pro Thr Ile Leu Lys Glu Leu Thr
260 265 270

His Ser Ser Met Gly Gln Val Gly Met Leu Ala Ile Leu Pro Tyr Val
275 280 285

Gly Ala Ile Ala Gly Met Phe Leu Phe Ser Ser Leu Ser Asp Arg Thr
290 295 300

Gly Lys Arg Lys Leu Phe Val Cys Leu Pro Leu Ile Gly Phe Ala Leu
305 310 315 320

Cys Met Phe Leu Ser Val Ala Leu Lys Asn Gln Ile Trp Leu Ser Tyr
325 330 335

Ala Ala Leu Val Gly Cys Gly Phe Phe Leu Gln Ser Ala Ala Gly Val
340 345 350

Phe Trp Thr Ile Pro Ala Arg Leu Phe Ser Ala Glu Met Ala Gly Gly
355 360 365

Ala Arg Gly Val Ile Asn Ala Leu Gly Asn Leu Gly Gly Phe Cys Gly
370 375 380

Pro Tyr Ala Val Gly Val Leu Ile Thr Leu Tyr Ser Lys Asp Ala Gly
385 390 395 400

Val Tyr Cys Leu Ala Ile Ser Leu Ala Leu Ala Ala Leu Met Ala Leu
405 410 415

Leu Leu Pro Ala Lys Cys Asp Ala Gly Ala Ala Pro Val Lys Thr Ile
420 425 430

Asn Pro His Lys Arg Thr Ala
435



<210> 15
<211> 501
<212> PRT
<213> LyxK-Ko

<400> 15

Met Ser Lys Lys Gln Ala Phe Trp Leu Gly Ile Asp Cys Gly Gly Thr
1 5 10 15

Tyr Leu Lys Ala Gly Leu Tyr Asp Ala Glu Gly His Glu His Gly Ile
20 25 30

Val Arg Gln Ala Leu Arg Thr Met Ser Pro Leu Pro Gly Tyr Ala Glu
35 40 45

Arg Asp Met Arg Gln Leu Trp Gln His Cys Ala Ala Thr Ile Ala Gly
50 55 60

Leu Leu Gln Gln Ala Gly Val Ser Gly Glu Gln Ile Lys Gly Val Gly
65 70 75 80

Ile Ser Ala Gln Gly Gln Gly Leu Phe Leu Leu Asp Lys Gln Asp Arg
85 90 95

Pro Leu Gly Asn Ala Ile Leu Ser Ser Asp Arg Arg Ala Leu Lys Ile
100 105 110

Val Gln Arg Trp Gln Arg Asp Arg Ile Pro Glu Arg Leu Tyr Pro Val
115 120 125

Thr Arg Gln Thr Leu Trp Thr Gly His Pro Ala Ser Leu Leu Arg Trp
130 135 140

Val Lys Glu Asn Glu Pro Gln Arg Tyr Ala Gln Ile Gly Cys Val Met
145 150 155 160

Met Gly His Asp Tyr Leu Arg Trp Cys Leu Thr Gly Ala Lys Gly Cys
165 170 175

Glu Glu Ser Asn Ile Ser Glu Ser Asn Leu Tyr Asn Met Ala Met Gly
180 185 190

Gln Tyr Asp Pro Arg Leu Thr Glu Trp Leu Gly Ile Gly Glu Ile Asp
195 200 205

Ser Ala Leu Pro Pro Val Val Gly Ser Ala Glu Ile Cys Gly Glu Ile
210 215 220

Thr Ala Gln Ala Ala Ala Leu Thr Gly Leu Ala Ala Gly Thr Pro Val
225 230 235 240

Val Gly Gly Leu Phe Asp Val Val Ser Thr Ala Leu Cys Ala Gly Ile
245 250 255

Glu Asp Glu Ser Thr Leu Asn Ala Val Met Gly Thr Trp Ala Val Thr
260 265 270

Ser Gly Ile Ala His Gly Leu Arg Asp His Glu Ala His Pro Tyr Val
275 280 285

Tyr Gly Arg Tyr Val Asn Asp Gly Gln Tyr Ile Val His Glu Ala Ser
290 295 300

Pro Thr Ser Ser Gly Asn Leu Glu Trp Phe Thr Ala Gln Trp Gly Asp
305 310 315 320

Leu Ser Phe Asp Glu Ile Asn Gln Ala Val Ala Ser Leu Pro Lys Ala
325 330 335

Gly Ser Glu Leu Phe Phe Leu Pro Phe Leu Tyr Gly Ser Asn Ala Gly
340 345 350

Leu Glu Met Thr Cys Gly Phe Tyr Gly Met Gln Ala Leu His Thr Arg
355 360 365

Ala His Leu Leu Gln Ala Val Tyr Glu Gly Val Val Phe Ser His Met
370 375 380

Thr His Leu Ser Arg Met Arg Glu Arg Phe Thr Asn Val Gln Ala Leu
385 390 395 400

Arg Val Thr Gly Gly Pro Ala His Ser Asp Val Trp Met Gln Met Leu
405 410 415

Ala Asp Val Ser Gly Leu Arg Ile Glu Leu Pro Lys Val Glu Glu Thr
420 425 430

Gly Cys Phe Gly Ala Ala Leu Ala Ala Arg Val Gly Thr Gly Val Tyr
435 440 445

Arg Ser Phe Ser Glu Ala Arg Arg Ala Arg Gln His Pro Val Arg Thr
450 455 460

Leu Leu Pro Asp Met Thr Ala His Ala Arg Tyr Gln Arg Lys Tyr Arg
465 470 475 480

His Tyr Leu His Leu Ile Glu Ala Leu Gln Gly Tyr His Ala Arg Ile
485 490 495

Lys Glu His Ala Leu
500



<210> 16 
<211> 220 
<212> PRT 
<213> YiaQ-Ko 

<400> 16 

Met Ser Arg Pro Leu Leu Gln Leu Ala Leu Asp His Thr Ser Leu Gln
1 5 10 15

Ala Ala Gln Arg Asp Val Ala Leu Leu Gln Asp His Val Asp Ile Val
20 25 30

Glu Ala Gly Thr Ile Leu Cys Leu Thr Glu Gly Leu Ser Ala Val Lys
35 40 45

Ala Leu Arg Ala Gln Cys Pro Gly Lys Ile Ile Val Ala Asp Trp Lys
50 55 60

Val Ala Asp Ala Gly Glu Thr Leu Ala Gln Gln Ala Phe Gly Ala Gly
65 70 75 80

Ala Asn Trp Met Thr Ile Ile Cys Ala Ala Pro Leu Ala Thr Val Glu
85 90 95

Lys Gly His Ala Val Ala Gln Ala Cys Gly Gly Glu Ile Gln Met Glu
100 105 110

Leu Phe Gly Asn Trp Thr Leu Asp Asp Ala Arg Ala Trp Tyr Arg Thr
115 120 125

Gly Val His Gln Ala Ile Tyr His Arg Gly Arg Asp Ala Gln Ala Ser
130 135 140

Gly Gln Gln Trp Gly Glu Ala Asp Leu Ala Arg Met Lys Ala Leu Ser
145 150 155 160

Asp Ile Gly Leu Glu Leu Ser Ile Thr Gly Gly Ile Thr Pro Ala Asp
165 170 175

Leu Pro Leu Phe Lys Asp Ile Asn Val Lys Ala Phe Ile Ala Gly Arg
180 185 190

Ala Leu Ala Gly Ala Ala His Pro Ala Arg Val Ala Ala Glu Phe His
195 200 205

Ala Gln Ile Asp Ala Ile Trp Gly Glu Gln His Ala
210 215 220



<210> 17 
<211> 286 
<212> PRT 
<213> YiaR-Ko 



<400> 17 
Met Arg Asn His Pro Leu Gly Ile Tyr Glu Lys Ala Leu Ala Lys Asp
1 5 10 15

Leu Ser Trp Pro Glu Arg Leu Val Leu Ala Lys Ser Cys Gly Phe Asp
20 25 30

Phe Val Glu Met Ser Val Asp Glu Thr Asp Glu Arg Leu Ser Arg Leu
35 40 45

Glu Trp Thr Pro Ala Gln Arg Ala Ser Leu Val Ser Ala Met Leu Glu
50 55 60

Thr Ala Val Ala Ile Pro Ser Met Cys Leu Ser Ala His Arg Arg Phe
65 70 75 80

Pro Phe Gly Ser Arg Asp Glu Ala Val Arg Asp Arg Ala Arg Glu Ile
85 90 95

Met Thr Lys Ala Ile Arg Leu Ala Arg Asp Leu Gly Ile Arg Thr Ile
100 105 110

Gln Leu Ala Gly Tyr Asp Val Tyr Tyr Glu Glu His Asp Glu Gly Thr
115 120 125

Arg Gln Arg Phe Ala Glu Gly Leu Ala Trp Ala Val Glu Gln Ala Ala
130 135 140

Ala Ala Gln Val Met Leu Ala Val Glu Ile Met Asp Thr Ala Phe Met
145 150 155 160

Asn Ser Ile Ser Lys Trp Lys Lys Trp Asp Glu Met Leu Ser Ser Pro
165 170 175

Trp Phe Thr Val Tyr Pro Asp Val Gly Asn Leu Ser Ala Trp Gly Asn
180 185 190

Asp Val Thr Ala Glu Leu Lys Leu Gly Ile Asp Arg Ile Ala Ala Ile
195 200 205

His Leu Lys Asp Thr Leu Pro Val Thr Asp Asp Ser Pro Gly Gln Phe
210 215 220

Arg Asp Val Pro Phe Gly Glu Gly Cys Val Asp Phe Val Gly Ile Phe
225 230 235 240

Lys Thr Leu Arg Glu Leu Asn Tyr Arg Gly Ser Phe Leu Ile Glu Met
245 250 255

Trp Thr Glu Lys Ala Ser Glu Pro Val Leu Glu Ile Ile Gln Ala Arg
260 265 270

Arg Trp Ile Glu Ser Arg Met Gln Glu Gly Gly Phe Thr Cys
275 280 285



<210> 18 
<211> 238 
<212> PRT 
<213> YiaS-Ko 

<400> 18 

Met Leu Glu Gln Leu Lys Ala Glu Val Leu Ala Ala Asn Leu Ala Leu
1 5 10 15

Pro Ala His Gly Leu Val Thr Phe Thr Trp Gly Asn Val Ser Ala Val
20 25 30

Asp Glu Thr Arg Lys Leu Met Val Ile Lys Pro Ser Gly Val Glu Tyr
35 40 45

Glu Val Met Thr Ala Asp Asp Met Val Val Val Glu Met Ala Ser Gly
50 55 60

Lys Val Val Glu Gly Gly Lys Lys Pro Ser Ser Asp Thr Pro Thr His
65 70 75 80

Leu Ala Leu Tyr Arg Arg Tyr Pro Gln Ile Gly Gly Ile Val His Thr
85 90 95

His Ser Arg His Ala Thr Ile Trp Ser Gln Ala Gly Leu Asp Leu Pro
100 105 110

Ala Trp Gly Thr Thr His Ala Asp Tyr Phe Tyr Gly Ala Ile Pro Cys
115 120 125

Thr Arg Arg Met Thr Val Glu Glu Ile Asn Gly Glu Tyr Glu Tyr Gln
130 135 140

Thr Gly Glu Val Ile Ile Lys Thr Phe Glu Gln Arg Gly Leu Asp Pro
145 150 155 160

Ala Gln Ile Pro Ala Val Leu Val His Ser His Gly Pro Phe Ala Trp
165 170 175

Gly Lys Asp Ala Ala Asp Ala Val His Asn Ala Val Val Leu Glu Glu
180 185 190

Cys Ala Tyr Met Gly Leu Phe Ser Arg Gln Trp Pro Gln Leu Pro Asp
195 200 205

Met Gln Ser Glu Leu Leu Asp Lys His Tyr Leu Arg Lys His Gly Ala
210 215 220

Asn Ala Ile Thr Gly Lys Thr Ser Pro Ala Glu Leu Pro Gly
225 230 235



<210> 19 
<211> 9334 
<212> DNA 
<213> yia 
<400> 19

ggatccgcgg gcgcaaaggc ggagacgcca gaacagtcct ggtcctgctg atgggacacc 60
acgcaggcga cttcacaggt acggcagccg atgcacttct ccgcatccgc gagaataaac 120
cgattcatcc ttctccattg gggataaaaa cgcagagtgc cagaaaaaac ccgctttcct 180
ctccctttga tcctgaatgg agtcagcggc gttttctctc agatgtccgg gattatctgg 240
tcatttgcct taaccttccc gcacggaaaa gcccagttcg cgagaaatcg cctctgccgt 300
atcgcgtagc ggcttgagta aatttttctc tcccacctgc ttgaggcgcg atgttgatag 360
agagatagaa atggcataag gcacgcgccc atggatatca aaaacgggga cagccaggca 420
cgacacgccc agctcgttct cttccctgtc catcgccata tttcgctcgc ggatctgcgc 480
cagttcatca tgcatcgcag gcaagccggt aatggtatta cgggtcagcg gctggataat 540
ctcctggtgt gaattccagt agctctcaac gtagtcagga tggccaaacg ccatataaat 600
ctttgcccat tgccgagcag tacagcgcat gtgctggcca atataggcgc gcgtacgcag 660
cataccggtg gtcggctcca gcttataaat caggatcgcg tggtcatctt cacggctgga 720
gaagttcacc gtctcgccgg tggccaggtt aagcgcctca agatgcggcg ccgcgacgtg 780
gataatattc agcgacgaca acgccttttg gccaacgcgg ataaattttg tcgtcagcgc 840
atagctcccc gccgccgggg caggcgtcac gtacccgcag gactgcagcc cctgtaataa 900
gcgatgaacg gtacttttgt tcagtcccgc cagttccgac agatgcgcca cgggacagcc 960
atttggataa ttactcagga tctcaattag catcaaccca cgaaaaaggc tctgacttcc 1020
ggcaggcctc tctttatctt gcgtgttctc gctttctttt gtgcccatfg cttccgctcc 1080
catttttgtc gcgttcagat ggtagcgcaa agtgtgtttc agttcacgat ctgaaccgaa 1140
aaaacacaac tttatgattt ttatgatttt taaaaataac gctgcccgtt gatctgacaa 1200
aaattgatcg ctatatttga aatcagattt cgcatagtga aatttagaga taaaaaagcg 1260
atcaactctg accaggaaaa cagcaatgaa agtcacgttt gagcagttaa aagaggcatt 1320
caatcgggta ctgctggacg cgtgcgtcgc ccgggaaacc gccgatgcct gcgcagaaat 1380
gtttgcccgc accaccgaat ccggcgtcta ttctcacggc gtgaaccgct ttcctcgctt 1440
catccagcag ttggataacg gcgacattat ccctgaggct caaccgcagc gggtgaccac 1500
gctcggcgcc atcgaacagt gggatgctca gcgttccatc ggcaacctga cggcgaaaaa 1560
gatgatggat cgggccattg agctggcctc cgatcacggt atcggcctgg tcgccttacg 1620
taatgctaac cactggatgc gcggcggcag ctacggctgg caggcggcgg aaaaaggcta 1680
catcggtatc tgctggacca actccatcgc cgttatggcg ccatggggcg ctaaagagtg 1740
ccgtatcggt accaacccgc tgatcgtcgc cattccgtcg acgccgatca ccatggtgga 1800
tatgtcgatg tcgatgttct cctacggcat gctggaggtt aaccgccttg ccggccgcga 1860
actgcccgtg gacggcggat tcgacgatga cggtcgtttg accaaagagc cggggacgat 1920
cgagaaaaat cgccgcattt tacccatggg ctactggaaa ggttccggcc tgtcgatcgt 1980
gctggatatg attgccaccc tcctctccaa cggatcgtcg gttgccgaag tgacccagga 2040
aaacagcgat gaatatggcg tttcgcagat cttcatcgct attgaagtgg ataagctgat 2100
cgacggcgca acccgcgacg ccaagctgca acggattatg gatttcatca ccaccgccga 2160
gcgcgccgat gaaaatgtgg cggtccgtct tcctggccat gaatttaccc gtctgctgga 2220
tgaaaaccgc cgcaacggca ttaccgtcga tgacagcgta tgggccaaaa ttcaggcgct 2280
gtaaggagct cacccatgac agcgtatggg ccaaaattca ggcgctgtaa ggagctcacc 2340
catgattttt ggtcatattg ctcaacctaa tccgtgtcgt ctgcccgcgg ccattgagcg 2400
ggcgcttgat ttcctgcgca cgacggattt ccacgcgctg gcacccggcg tcgtggaaat 2460
cgacggccaa aacatcttcg cgcaggttat cgacttaacc actcgcgatg ccgctgaaaa 2520
tcgtccggag gtccaccgtc gctatctgga tatccagttt ctggcatcgg gcgaagaaaa 2580
aatcggtatc gccattgata ccggcaataa tcaaatcagc gaatctttat tagaacagcg 2640
cgatattatt ttttatcacg acagcgaaca tgaatcgttc tttgaaatga cgccaggcaa 2700
ctatgcgata tttttcccgc aagatgttca tcgtcctgga tgtaataaaa ctgtagccac 2760
gccgatccgc aaaatagtcg ttaaagtcgc tatttcagtt ttataagaag gagcacaaaa 2820
tgaattcgaa taataccggt tacattatcg gtgcgtaccc ctgtgccccc tgtgcaccct 2880
catttcacca aaagagtgaa gaggaagaga tggaattctg gcggcagctc tccgacaccc 2940 
cggatattcg cgggctggag caaccctgcc taccctgcct tgaacatctt catccgctcg 3000 
gcgacgagtg gttattgcgc cataccccgg gacactggca gattgtcgtt accgccatca 3060 
tggaaaccat gcgccgccgc ggtgaaaacg gcggctttgg gctggcgtcc agcgacgaaa 3120 
cgcagcgcaa agcctgcgtg gagtactatc gccacctgca gcagaagatc gctaaaatca 3180 
atggcaatac cgccggaaag gtcattgccc ttgagcttca cgccgccccg ctggcgggca 3240 
atgccaacgt ggctcaggct accgacgcct ttgcccgttc attaaaagaa attacccgct 3300 
gggactggtc ctgcgagctg gtgctggagc actgcgacgc gatgaccggc agcgcgccgc 3360 
gcaaaggatt tttgccgtta gaaaacgtgc tggaagccat tgccgattat gacgttggca 3420 
tttgtattaa ctgggcgcgt tcggccattg aagggcggaa taccgtgcta ccgctcaccc 3480 
atacgcagca ggtaaaacgg gcaggaaagc tcggcgcgct gatgttttct ggcacgacgc 3540 
agaccggcga gtacggcgaa tggcaggatt tacacgcgcc gttcgcgcct ttctgcccgc 3600 
agagcctgat gaccaccgaa cacgctcgtg aattatttgc ctgcgcagga accgcccccc 3660 
tgcaattttc aggcattaaa ttactggaaa ttaatgccag cgcaaacgtt gatcatcgca 3720 
tcgcgatatt acgcgacggc atctccgcgc taaaacaagc acaataataa taatcacctt 3780 
catcaccaga atatttttaa tattacgaga ctataaagat gaatataacc tctaactcta 3840 
caaccaaaga tataccgcgc cagcgctggt taagaatcat tccgcctata ctgatcactt 3900 
gtattatttc ttatatggac cgggtcaata ttgcctttgc gatgcccgga ggtatggatg 3960 
ccgacttagg tatttccgcc accatggcgg ggctggcggg cggtattttc tttatcggtt 4020 
atctattttt acaggttccc ggcgggaaaa ttgccgttca cggtagcggt aagaaattta 4080 
tcggctggtc gctggtcgcc tgggcggtca tctccgtgct gacggggtta attaccaatc 4140 
agtaccagct gctggccctg cgcttcttac tgggcgtggc ggaaggcggt atgctgccgg 4200 
tcgttctcac gatgatcagt aactggttcc ccgacgctga acgcggtcgc gccaacgcga 4260 
ttgtcattat gtttgtgccg attgccggga ttatcaccgc cccactctca ggctggatta 4320
tcacggttct cgactggcgc tggctgttta ttatcgaagg tttgctctcg ctggttgttc 4380 
tggttctgtg ggcatacacc atctatgacc gtccgcagga agcgcgctgg atttccgaag 4440 
cagagaagcg ctatctggtc gagacgctgg ccgcggagca aaaagccatt gccggcaccg 4500 
aggtgaaaaa cgcctctctg agcgccgttc tctccgacaa aaccatgtgg cagcttatcg 4560 
ccctgaactt cttctaccag accggcattt acggctacac cctgtggcta cccaccattc 4620 
tgaaagaatt gacccatagc agcatggggc aggtcggcat gcttgccatt ctgccgtacg 4680 
tcggcgccat tgctgggatg ttcctgtttt cctccctttc agaccgaacc ggtaaacgca 4740 
agctgttcgt ctgcctgccg ctgattggct tcgctctgtg catgttcctg tcggtggcgc 4800 
tgaaaaacca aatttggctc tcctatgccg cgctggtcgg ctgcggattc ttcctgcaat 4860 
cggcggctgg cgtgttctgg accatcccgg cacgtctgtt cagcgcggaa atggcgggcg 4920
gcgcgcgcgg ggttatcaac gcgcttggca acctcggcgg attttgtggc ccttatgcgg 4980 
tcggggtgct gatcacgttg tacagcaaag acgctggcgt ctattgcctg gcgatctccc 5040
tggcgctggc cgcgctgatg gcgctgctgc tgccggcgaa atgcgatgcc ggtgctgcgc 5100 
cggtaaagac gataaatcca cataaacgca ctgcgtaaac tcgagcccgg cggcgctgcg 5160 
cctgccgggc ctgcgaaata tgccgggttc acccggtaac aatgagatgc gaaagatgag 5220 
caagaaacag gccttctggc tgggtattga ttgcggcggc acctatctga aagccggttt 5280 
atatgacgcc gaaggtcatg aacatggcat tgtgcggcaa gcgctacgga cgatgtcgcc 5340 
cctgccgggt tacgccgaac gcgacatgcg ccagctctgg caacactgcg cggcgaccat 5400 
tgccgggcta ttacagcagg caggtgtatc cggcgaacag attaaaggcg tgggcatctc 5460
cgctcagggt caagggctct ttctcctcga taagcaggat cggccgctgg gtaacgccat 5520 
cctctcctcc gatcgtcggg cgctgaaaat cgttcagcgc tggcagcggg accgtattcc 5580 
cgaacggctc tatcccgtta cccgccagac gctgtggacc ggacatccgg cttctttgct 5640 
gcgctgggta aaagagaatg aaccccagcg ctacgcgcaa attggctgcg tgatgatggg 5700 
gcatgactat ctgcgctggt gcttaaccgg cgcgaagggc tgcgaggaga gcaacatctc 5760
cgagtccaac ctctacaaca tggccatggg ccagtacgac ccgcgcctga ccgagtggct 5820
gggcatcggt gaaatcgata gcgcgctgcc ccccgttgta gggtcagccg aaatttgcgg 5880
ggagatcacc gctcaggcag ccgctttaac cggtctggcg gcgggtactc ccgtcgttgg 5940
cggcctgttt gacgtggtct ccaccgccct ttgcgccggg attgaggatg agtcgaccct 6000
caatgcggtg atggggacct gggccgtcac tagcggtatc gctcacggcc tgcgcgacca 6060
tgaggcccac ccttacgtct atggccgcta cgtcaatgac ggccagtata tcgttcacga 6120
agccagcccg acctcatccg gcaacctcga atggtttacc gcccagtggg gcgatctctc 6180
gtttgatgag atcaatcagg ccgtcgccag cctgccgaaa gccgggagcg agctgttttt 6240
tctgccgttt ctgtatggca gcaacgccgg gctggagatg acctgcggct tttacggcat 6300
gcaggcgctg catacccgcg cgcacctgct gcaggcggtt tatgaaggcg tggtatttag 6360
ccatatgacc cacctcagcc gtatgcgcga acgctttaca aacgttcagg ccctgcgcgt 6420
caccggcggc ccggcgcact ccgacgtctg gatgcagatg ctggcggacg taagcggctt 6480
acgcattgaa ctcccgaagg tggaagagac cggctgtttt ggcgcggccc tcgccgctcg 6540
tgtcggtacc ggcgtatacc gcagctttag cgaagcccgg cgcgcccggc agcacccggt 6600
gcgcacgctg ctgcccgata tgaccgccca cgcgcgctat cagcgcaaat accgccacta 6660
cctgcatttg attgaagcac tacagggcta tcacgcccgt attaaggagc acgcattatg 6720
agccgaccat tactgcagct ggcgctcgac cataccagcc ttcaggctgc gcagcgcgat 6780
gtcgccctgc tacaggatca cgttgatatt gtggaggcgg gaaccatcct ctgcttaacc 6840
gaagggctta gcgcggttaa agccctgcgc gcccagtgtc cggggaagat catcgtcgcc 6900
gactggaaag tcgccgacgc cggtgaaacc ctggcgcagc aggcctttgg cgctggcgcc 6960
aactggatga ccatcatttg cgccgcaccg ctcgccacgg tcgagaaagg ccacgccgtg 7020
gcccaggcct gcggcggtga aattcagatg gagctgttcg gcaactggac gctggatgac 7080
gcccgcgcct ggtaccgtac cggcgtccat caggcgattt accatcgcgg acgcgatgcc 7140
caggccagcg ggcagcagtg gggggaggcg gatctggcgc gcatgaaagc gctgtccgat 7200
attggccttg agctatcgat taccggcggc attaccccag ccgatctacc gctgttcaaa 7260
gatatcaacg tcaaagcctt tattgccggg cgcgcgctgg caggcgccgc ccatccggcg 7320
cgggttgccg ccgaattcca cgcgcaaatc gacgctatct ggggagaaca gcatgcgtaa 7380
ccacccgtta ggtatttatg aaaaagcgct ggcgaaggat ctcagctggc ctgagcggct 7440
ggtactggcc aaaagctgcg gttttgattt tgtcgaaatg tcggtggacg agaccgatga 7500
acgcctttcg cgcctggagt ggaccccggc ccagcgcgca tcgctggtga gcgcgatgct 7560
ggaaaccgcg gtcgccattc cctcgatgtg cttgtccgcc catcgccgtt tcccctttgg 7620
cagccgcgat gaagcggtac gcgatcgggc gcgagagatt atgaccaaag ccatccgcct 7680
ggcgcgcgat ctggggatcc gcaccatcca gctggcgggt tacgacgtct attacgaaga 7740
gcatgatgaa ggcacccggc agcgttttgc cgaagggctg gcctgggcgg tagaacaggc 7800
cgccgccgcg caggtaatgc tggcggtgga gatcatggac accgccttta tgaactccat 7860
cagcaaatgg aaaaagtggg acgagatgct ttcgtcaccg tggtttaccg tctacccgga 7920
cgtcggcaac ctcagcgcct ggggaaacga cgtcaccgcc gagctgaagc tgggcatcga 7980
tcgtatcgcc gccatccacc tgaaagatac gctgcccgtg accgacgata gccctggcca 8040
gttccgcgac gtgccgttcg gcgaaggatg cgtcgatttt gtcggcattt ttaagacgct 8100
gcgcgagctg aactaccgcg gttcattttt gattgagatg tggacggaga aagccagcga 8160
gccggtgctg gagattatcc aggcccggcg ctggatcgaa tcacggatgc aggaaggggg 8220
attcacatgt tagaacaact gaaagccgag gtactggcgg caaacctggc cctccccgca 8280
cacggcctgg tcacctttac ctggggcaac gtcagcgcgg tcgatgaaac gcgcaagctg 8340
atggtcatta agccttccgg cgtcgaatat gaggtgatga ccgccgacga tatggtggtc 8400
gtagagatgg ccagcggtaa agtcgttgaa ggcggtaaaa aaccctcttc agatacgcca 8460
acgcatctgg cgctttatcg ccgctatccg cagatcggcg ggatcgtgca tacccactcc 8520
cgccacgcga cgatctggtc gcaggccggg ctcgatctcc ccgcctgggg caccacccac 8580
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gccgactact tctatggcgc gatcccctgt acccgacgga tgaccgttga ggagattaac 8640 
ggcgagtatg agtatcagac cggcgaggtg attatcaaaa cctttgaaca gcgcggcctg 8700 
gatccggcgc aaatcccggc ggtattggtc cattcacacg gcccctttgc ctggggtaaa 8760 
gacgccgccg acgccgtaca taacgccgtg gtgctggagg agtgcgccta catgggcctc 8820 
ttctcgcgcc agtggccaca gctgccggat atgcagtctg aactgctcga taaacactat 8880 
ctgcgtaaac acggcgcgaa cgctattacg ggcaaaacta gtcccgcgga actccccgga 8940 
taaggcgctt tggcccccgg gggaagcgtg caggatgttg ctgaactttc ccggagcgat 9000 
gctgcgcatc tgtccgggct acgcgtcccc ggcgctctgc ggtcagcacc gcgcccggcg 9060 
gaaaacccat caaccctacg ccgaattaat atgtccttgc agtaacgacg cttccacgcc 9120 
gccggtccag gctggtgtgc ttgcggaaaa tcttgcgaaa atagccgaca tcgttaaacc 9180 
cgcatttcat cgccacctcg gtaatcgaca gggaatcgct gataagcagc ttttccgccg 9240 
cccttacccg ctgacggtgc agcgcttcgg taacgtcagc cggaaagcat ggcgataaac 9300 
ggccccagat aacccgcgtt gcagtgcagc tcct 9334 

<210> 20 
<211> 282 
<212> PRT 
<213> YiaJ-Ec 

<400> 20 

Met Gly Lys Glu Val Met Gly Lys Lys Glu Asn Glu Met Ala Gln Glu
1 5 10 15

Lys Glu Arg Pro Ala Gly Ser Gln Ser Leu Phe Arg Gly Leu Met Leu
20 25 30

Ile Glu Ile Leu Ser Asn Tyr Pro Asn Gly Cys Pro Leu Ala His Leu
35 40 45

Ser Glu Leu Ala Gly Leu Asn Lys Ser Thr Val His Arg Leu Leu Gln
50 55 60

Gly Leu Gln Ser Cys Gly Tyr Val Thr Thr Ala Pro Ala Ala Gly Ser
65 70 75 80

Tyr Arg Leu Thr Thr Lys Phe Ile Ala Val Gly Gln Lys Ala Leu Ser
85 90 95

Ser Leu Asn Ile Ile His Ile Ala Ala Pro His Leu Glu Ala Leu Asn
100 105 110

Ile Ala Thr Gly Glu Thr Ile Asn Phe Ser Ser Arg Glu Asp Asp His
115 120 125

Ala Ile Leu Ile Tyr Lys Leu Glu Pro Thr Thr Gly Met Leu Arg Thr
130 135 140

Arg Ala Tyr Ile Gly Gln His Met Pro Leu Tyr Cys Ser Ala Met Gly
145 150 155 160

Lys Ile Tyr Met Ala Phe Gly His Pro Asp Tyr Val Lys Ser Tyr Trp
165 170 175

Glu Ser His Gln His Glu Ile Gln Pro Leu Thr Arg Asn Thr Ile Thr
180 185 190

Glu Leu Pro Ala Met Phe Asp Glu Leu Ala His Ile Arg Glu Ser Gly
195 200 205

Ala Ala Met Asp Arg Glu Glu Asn Glu Leu Gly Val Ser Cys Ile Ala
210 215 220

Val Pro Val Phe Asp Ile His Gly Arg Val Pro Tyr Ala Val Ser Ile
225 230 235 240

Ser Leu Ser Thr Ser Arg Leu Lys Gln Val Gly Glu Lys Asn Leu Leu
245 250 255

Lys Pro Leu Arg Glu Thr Ala Gln Ala Ile Ser Asn Glu Leu Gly Phe
260 265 270

Thr Val Arg Asp Asp Leu Gly Ala Ile Thr
275 280



<210> 21 
<211> 268 
<212> PRT 
<213> YiaJ-Hi 

<400> 21 

Met Asn Ile Glu Val Lys Met Glu Lys Glu Lys Ser Leu Gly Asn Gln
1 5 10 15

Ala Leu Ile Arg Gly Leu Arg Leu Leu Asp Ile Leu Ser Asn Tyr Pro
20 25 30

Asn Gly Cys Pro Leu Ala Lys Leu Ala Glu Leu Ala Asn Leu Asn Lys
35 40 45

Ser Thr Ala His Arg Leu Leu Gln Gly Leu Gln Asn Glu Gly Tyr Val
50 55 60

Lys Pro Ala Asn Ala Ala Gly Ser Tyr Arg Leu Thr Ile Lys Cys Leu
65 70 75 80

Ser Ile Gly Gln Lys Val Leu Ser Ser Met Asn Ile Ile His Val Ala
85 90 95

Ser Pro Tyr Leu Glu Gln Leu Asn Leu Lys Leu Gly Glu Thr Ile Asn
100 105 110

Phe Ser Lys Arg Glu Asp Asp His Ala Ile Met Ile Tyr Lys Leu Glu
115 120 125

Pro Thr Asn Gly Met Leu Lys Thr Arg Ala Tyr Ile Gly Gln Tyr Leu
130 135 140

Lys Leu Tyr Cys Ser Ala Met Gly Lys Ile Phe Leu Ala Tyr Glu Lys
145 150 155 160

Lys Val Asp Tyr Leu Ser His Tyr Trp Gln Ser His Gln Arg Glu Ile
165 170 175

Lys Lys Leu Thr Arg Tyr Thr Ile Thr Glu Leu Asp Asp Ile Lys Leu
180 185 190

Glu Leu Glu Thr Ile Arg Gln Thr Ala Tyr Ala Met Asp Arg Glu Glu
195 200 205

Asn Glu Leu Gly Val Thr Cys Ile Ala Cys Pro Ile Phe Asp Ser Phe
210 215 220

Gly Gln Val Glu Tyr Ala Ile Ser Val Ser Met Ser Ile Tyr Arg Leu
225 230 235 240

Asn Lys Phe Gly Thr Asp Ala Phe Leu Gln Glu Ile Arg Lys Thr Ala
245 250 255

Glu Gln Ile Ser Leu Glu Leu Gly Tyr Glu Asn Ile
260 265



<210> 22 
<211> 332 
<212> PRT 
<213> YiaK-Ec 

<400> 22 

Met Lys Val Thr Phe Glu Gln Leu Lys Ala Ala Phe Asn Arg Val Leu
1 5 10 15

Ile Ser Arg Gly Val Asp Ser Glu Thr Ala Asp Ala Cys Ala Glu Met
20 25 30

Phe Ala Arg Thr Thr Glu Ser Gly Val Tyr Ser His Gly Val Asn Arg
35 40 45

Phe Pro Arg Phe Ile Gln Gln Leu Glu Asn Gly Asp Ile Ile Pro Asp
50 55 60

Ala Gln Pro Lys Arg Ile Thr Ser Leu Gly Ala Ile Glu Gln Trp Asp
65 70 75 80

Ala Gln Arg Ser Ile Gly Asn Leu Thr Ala Lys Lys Met Met Asp Arg
85 90 95

Ala Ile Glu Leu Ala Ala Asp His Gly Ile Gly Leu Val Ala Leu Arg
100 105 110

Asn Ala Asn His Trp Met Arg Gly Gly Ser Tyr Gly Trp Gln Ala Ala
115 120 125

Glu Lys Gly Tyr Ile Gly Ile Cys Trp Thr Asn Ser Ile Ala Val Met
130 135 140

Pro Pro Trp Gly Ala Lys Glu Cys Arg Ile Gly Thr Asn Pro Leu Ile
145 150 155 160

Val Ala Ile Pro Ser Thr Pro Ile Thr Met Val Asp Met Ser Met Ser
165 170 175

Met Phe Ser Tyr Gly Met Leu Glu Val Asn Arg Leu Ala Gly Arg Gln
180 185 190

Leu Pro Val Asp Gly Gly Phe Asp Asp Glu Gly Asn Leu Thr Lys Glu
195 200 205

Pro Gly Val Ile Glu Lys Asn Arg Arg Ile Leu Pro Met Gly Tyr Trp
210 215 220

Lys Gly Ser Gly Met Ser Ile Val Leu Asp Met Ile Ala Thr Leu Leu
225 230 235 240

Ser Asp Gly Ala Ser Val Ala Glu Val Thr Gln Asp Asn Ser Asp Glu
245 250 255

Tyr Gly Ile Ser Gln Ile Phe Ile Ala Ile Glu Val Asp Lys Leu Ile
260 265 270

Asp Gly Pro Thr Arg Asp Ala Lys Leu Gln Arg Ile Met Asp Tyr Val
275 280 285

Thr Ser Ala Glu Arg Ala Asp Glu Asn Gln Ala Ile Arg Leu Pro Gly
290 295 300

His Glu Phe Thr Thr Leu Leu Ala Glu Asn Arg Arg Asn Gly Ile Thr
305 310 315 320

Val Asp Asp Ser Val Trp Ala Lys Ile Gln Ala Leu
325 330



<210> 23 
<211> 332 
<212> PRT 
<213> YiaK-Hi 

<400> 23 

Met Arg Val Ser Tyr Asp Glu Leu Lys Asn Glu Phe Lys Arg Val Leu
1 5 10 15

Leu Asp Arg Gln Leu Thr Glu Glu Leu Ala Glu Glu Cys Ala Thr Ala
20 25 30

Phe Thr Asp Thr Thr Gln Ala Gly Ala Tyr Ser His Gly Ile Asn Arg
35 40 45

Phe Pro Arg Phe Ile Gln Gln Leu Glu Gln Gly Asp Ile Val Pro Asn
50 55 60

Ala Ile Pro Thr Lys Val Leu Ser Leu Gly Ser Ile Glu Gln Trp Asp
65 70 75 80

Ala His Gln Ala Ile Gly Asn Leu Thr Ala Lys Lys Met Met Asp Arg
85 90 95

Ala Ile Glu Leu Ala Ser Gln His Gly Val Gly Val Ile Ala Leu Arg
100 105 110

Asn Ala Asn His Trp Met Arg Gly Gly Ser Tyr Gly Trp Gln Ala Ala
115 120 125

Glu Lys Gly Tyr Ile Gly Ile Cys Trp Thr Asn Ala Leu Ala Val Met
130 135 140

Pro Pro Trp Gly Ala Lys Glu Cys Arg Ile Gly Thr Asn Pro Leu Ile
145 150 155 160

Ile Ala Val Pro Thr Thr Pro Ile Thr Met Val Asp Met Ser Cys Ser
165 170 175

Met Tyr Ser Tyr Gly Met Leu Glu Val His Arg Leu Ala Gly Arg Gln
180 185 190

Thr Phe Val Asp Ala Gly Phe Asp Asp Glu Gly Asn Leu Thr Arg Asp
195 200 205

Pro Ser Ile Val Glu Lys Asn Arg Arg Leu Leu Pro Met Gly Phe Trp
210 215 220

Lys Gly Ser Gly Leu Ser Ile Val Leu Asp Met Ile Ala Thr Leu Leu
225 230 235 240

Ser Asn Gly Glu Ser Thr Val Ala Val Thr Glu Asp Lys Asn Asp Glu
245 250 255

Tyr Cys Val Ser Gln Val Phe Ile Ala Ile Glu Val Asp Arg Leu Ile
260 265 270

Asp Gly Lys Ser Lys Asp Glu Lys Leu Asn Arg Ile Met Asp Tyr Val
275 280 285

Lys Thr Ala Glu Arg Ser Asp Pro Thr Gln Ala Val Arg Leu Pro Gly
290 295 300

His Glu Phe Thr Thr Ile Leu Ser Asp Asn Gln Thr Asn Gly Ile Pro
305 310 315 320

Val Asp Glu Arg Val Trp Ala Lys Leu Lys Thr Leu
325 330



<210> 24 
<211> 155 
<212> PRT 
<213> YiaL-Ec 

<400> 24 

Met Ile Phe Gly His Ile Ala Gln Pro Asn Pro Cys Arg Leu Pro Ala
1 5 10 15

Ala Ile Glu Lys Ala Leu Asp Phe Leu Arg Ala Thr Asp Phe Asn Ala
20 25 30

Leu Glu Pro Gly Val Val Glu Ile Asp Gly Lys Asn Ile Tyr Thr Gln
35 40 45

Ile Ile Asp Leu Thr Thr Arg Glu Ala Val Val Asn Arg Pro Glu Val
50 55 60

His Arg Arg Tyr Ile Asp Ile Gln Phe Leu Ala Trp Gly Glu Glu Lys
65 70 75 80

Ile Gly Ile Ala Ile Asp Thr Gly Asn Asn Lys Val Ser Glu Ser Leu
85 90 95

Leu Glu Gln Arg Asn Ile Ile Phe Tyr His Asp Ser Glu His Glu Ser
100 105 110

Phe Ile Glu Met Ile Pro Gly Ser Tyr Ala Ile Phe Phe Pro Gln Asp
115 120 125

Val His Arg Pro Gly Cys Ile Met Gln Thr Ala Ser Glu Ile Arg Lys
130 135 140

Ile Val Val Lys Val Ala Leu Thr Ala Leu Asn
145 150 155



<210> 25 

<211> 155 

<212> PRT 

<213> YiaL-Hi 

<400> 25 
Met Ile Ile Ser Ser Leu Thr Asn Pro Asn Phe Lys Val Gly Leu Pro
1 5 10 15

Lys Val Ile Ala Glu Val Cys Asp Tyr Leu Asn Thr Leu Asp Leu Asn
20 25 30

Ala Leu Glu Asn Gly Arg His Asp Ile Asn Asp Gln Ile Tyr Met Asn
35 40 45

Val Met Glu Pro Glu Thr Ala Glu Pro Ser Ser Lys Lys Ala Glu Leu
50 55 60

His His Glu Tyr Leu Asp Val Gln Val Leu Ile Arg Gly Thr Glu Asn
65 70 75 80

Ile Glu Val Gly Ala Thr Tyr Pro Asn Leu Ser Lys Tyr Glu Asp Tyr
85 90 95

Asn Glu Ala Asp Asp Tyr Gln Leu Cys Ala Asp Ile Asp Asp Lys Phe
100 105 110

Thr Val Thr Met Lys Pro Lys Met Phe Ala Val Phe Tyr Pro Tyr Glu
115 120 125

Pro His Lys Pro Cys Cys Val Val Asn Gly Lys Thr Glu Lys Ile Lys
130 135 140

Lys Leu Val Val Lys Val Pro Val Lys Leu Ile
145 150 155



<210> 26 
<211> 498 
<212> PRT 
<213> LyxK-Ec 



<400> 26 

Met Thr Gln Tyr Trp Leu Gly Leu Asp Cys Gly Gly Ser Trp Leu Lys
1 5 10 15

Ala Gly Leu Tyr Asp Arg Glu Gly Arg Glu Ala Gly Val Gln Arg Leu
20 25 30

Pro Leu Cys Ala Leu Ser Pro Gln Pro Gly Trp Ala Glu Arg Asp Met
35 40 45

Ala Glu Leu Trp Gln Cys Cys Met Ala Val Ile Arg Ala Leu Leu Thr
50 55 60

His Ser Gly Val Ser Gly Glu Gln Ile Val Gly Ile Gly Ile Ser Ala
65 70 75 80

Gln Gly Lys Gly Leu Phe Leu Leu Asp Lys Asn Asp Lys Pro Leu Gly
85 90 95

Asn Ala Ile Leu Ser Ser Asp Arg Arg Ala Met Glu Ile Val Arg Arg
100 105 110

Trp Gln Glu Asp Gly Ile Pro Glu Lys Leu Tyr Pro Leu Thr Arg Gln
115 120 125

Thr Leu Trp Thr Gly His Pro Val Ser Leu Leu Arg Trp Leu Lys Glu
130 135 140

His Glu Pro Glu Arg Tyr Ala Gln Ile Gly Cys Val Met Met Thr His
145 150 155 160

Asp Tyr Leu Arg Trp Cys Leu Thr Gly Val Lys Gly Cys Glu Glu Ser
165 170 175

Asn Ile Ser Glu Ser Asn Leu Tyr Asn Met Ser Leu Gly Glu Tyr Asp
180 185 190

Pro Cys Leu Thr Asp Trp Leu Gly Ile Ala Glu Ile Asn His Ala Leu
195 200 205

Pro Pro Val Val Gly Ser Ala Glu Ile Cys Gly Glu Ile Thr Ala Gln
210 215 220

Thr Ala Ala Leu Thr Gly Leu Lys Ala Gly Thr Pro Val Val Gly Gly
225 230 235 240

Leu Phe Asp Val Val Ser Thr Ala Leu Cys Ala Gly Ile Glu Asp Glu
245 250 255

Phe Thr Leu Asn Ala Val Met Gly Thr Trp Ala Val Thr Ser Gly Ile
260 265 270

Thr Arg Gly Leu Arg Asp Gly Glu Ala His Pro Tyr Val Tyr Gly Arg
275 280 285

Tyr Val Asn Asp Gly Glu Phe Ile Val His Glu Ala Ser Pro Thr Ser
290 295 300

Ser Gly Asn Leu Glu Trp Phe Thr Ala Gln Trp Gly Glu Ile Ser Phe
305 310 315 320

Asp Glu Ile Asn Gln Ala Val Ala Ser Leu Pro Lys Ala Gly Gly Asp
325 330 335

Leu Phe Phe Leu Pro Phe Leu Tyr Gly Ser Asn Ala Gly Leu Glu Met
340 345 350

Thr Ser Gly Phe Tyr Gly Met Gln Ala Ile His Thr Arg Ala His Leu
355 360 365

Leu Gln Ala Ile Tyr Glu Gly Val Val Phe Ser His Met Thr His Leu
370 375 380

Asn Arg Met Arg Glu Arg Phe Thr Asp Val His Thr Leu Arg Val Thr
385 390 395 400

Gly Gly Pro Ala His Ser Asp Val Trp Met Gln Met Leu Ala Asp Val
405 410 415

Ser Gly Leu Arg Ile Glu Leu Pro Gln Val Glu Glu Thr Gly Cys Phe
420 425 430

Gly Ala Ala Leu Ala Ala Arg Val Gly Thr Gly Val Tyr His Asn Phe
435 440 445

Ser Glu Ala Gln Arg Asp Leu Arg His Pro Val Arg Thr Leu Leu Pro
450 455 460

Asp Met Thr Ala His Gln Leu Tyr Gln Lys Lys Tyr Gln Arg Tyr Gln
465 470 475 480

His Leu Ile Ala Ala Leu Gln Gly Phe His Ala Arg Ile Lys Glu His
485 490 495

Thr Leu



<210> 27
<211> 485 
<212> PRT 
<213> LyxK-Hi 

<400> 27 

Met His Tyr Tyr Leu Gly Ile Asp Cys Gly Gly Thr Phe Ile Lys Ala
1 5 10 15

Ala Ile Phe Asp Gln Asn Gly Thr Leu Gln Ser Ile Ala Arg Arg Asn
20 25 30

Ile Pro Ile Ile Ser Glu Lys Pro Gly Tyr Ala Glu Arg Asp Met Asp
35 40 45

Glu Leu Trp Asn Leu Cys Ala Gln Val Ile Gln Lys Thr Ile Arg Gln
50 55 60

Ser Ser Ile Leu Pro Gln Gln Ile Lys Ala Ile Gly Ile Ser Ala Gln
65 70 75 80

Gly Lys Gly Ala Phe Phe Leu Asp Lys Asp Asn Lys Pro Leu Gly Arg
85 90 95

Ala Ile Leu Ser Ser Asp Gln Arg Ala Tyr Glu Ile Val Gln Cys Trp
100 105 110

Gln Lys Glu Asn Ile Leu Gln Lys Phe Tyr Pro Ile Thr Leu Gln Thr
115 120 125

Leu Trp Met Gly His Pro Val Ser Ile Leu Arg Trp Ile Lys Glu Asn
130 135 140

Glu Pro Ser Arg Tyr Glu Gln Ile His Thr Ile Leu Met Ser His Asp
145 150 155 160

Tyr Leu Arg Phe Cys Leu Thr Glu Lys Leu Tyr Cys Glu Glu Thr Asn
165 170 175

Ile Ser Glu Ser Asn Phe Tyr Asn Met Arg Glu Gly Lys Tyr Asp Ile
180 185 190

Gln Leu Ala Lys Leu Phe Gly Ile Thr Glu Cys Ile Asp Lys Leu Pro
195 200 205

Pro Ile Ile Lys Ser Asn Lys Ile Ala Gly Tyr Val Thr Ser Arg Ala
210 215 220

Ala Glu Gln Ser Gly Leu Val Glu Gly Ile Pro Val Val Gly Gly Leu
225 230 235 240

Phe Asp Val Val Ser Thr Ala Leu Cys Ala Asp Leu Lys Asp Asp Gln
245 250 255

His Leu Asn Val Val Leu Gly Thr Trp Ser Val Val Ser Gly Val Thr
260 265 270

His Tyr Ile Asp Asp Asn Gln Thr Ile Pro Phe Val Tyr Gly Lys Tyr
275 280 285

Pro Glu Lys Asn Lys Phe Ile Ile His Glu Ala Ser Pro Thr Ser Ala
290 295 300

Gly Asn Leu Glu Trp Phe Val Asn Gln Phe Asn Leu Pro Asn Tyr Asp
305 310 315 320

Asp Ile Asn His Glu Ile Ala Lys Leu Lys Pro Ala Ser Ser Ser Val
325 330 335

Leu Phe Ala Pro Phe Leu Tyr Gly Ser Asn Ala Lys Leu Gly Met Gln
340 345 350

Ala Gly Phe Tyr Gly Ile Gln Ser His His Thr Gln Ile His Leu Leu
355 360 365

Gln Ala Ile Tyr Glu Gly Val Ile Phe Ser Leu Met Ser His Leu Glu
370 375 380

Arg Met Gln Val Arg Phe Pro Asn Ala Ser Thr Val Arg Val Thr Gly
385 390 395 400

Gly Pro Ala Lys Ser Glu Val Trp Met Gln Met Leu Ala Asp Ile Ser
405 410 415

Gly Met Arg Leu Glu Ile Pro Asn Ile Glu Glu Thr Gly Cys Leu Gly
420 425 430

Ala Ala Leu Met Ala Met Gln Ala Glu Ser Ala Val Glu Ile Ser Gln
435 440 445

Ile Leu Asn Ile Asp Arg Lys Ile Phe Leu Pro Asp Lys Asn Gln Tyr
450 455 460

Ser Lys Tyr Gln His Lys Tyr His Arg Tyr Leu Lys Phe Ile Glu Ala
465 470 475 480

Leu Lys Asn Leu Asp
485



<210> 28 
<211> 220 
<212> PRT 
<213> YiaQ-Ec 

<400> 28 

Met Ser Arg Pro Leu Leu Gln Leu Ala Leu Asp His Ser Ser Leu Glu
1 5 10 15

Ala Ala Gln Arg Asp Val Thr Leu Leu Lys Asp Ser Val Asp Ile Val
20 25 30

Glu Ala Gly Thr Ile Leu Cys Leu Asn Glu Gly Leu Gly Ala Val Lys
35 40 45

Ala Leu Arg Glu Gln Cys Pro Asp Lys Ile Ile Val Ala Asp Trp Lys
50 55 60

Val Ala Asp Ala Gly Glu Thr Leu Ala Gln Gln Ala Phe Gly Ala Gly
65 70 75 80

Ala Asn Trp Met Thr Ile Ile Cys Ala Ala Pro Leu Ala Thr Val Glu
85 90 95

Lys Gly His Ala Met Ala Gln Arg Cys Gly Gly Glu Ile Gln Ile Glu
100 105 110

Leu Phe Gly Asn Trp Thr Leu Asp Asp Ala Arg Asp Trp His Arg Ile
115 120 125

Gly Val Arg Gln Ala Ile Tyr His Arg Gly Arg Asp Ala Gln Ala Ser
130 135 140

Gly Gln Gln Trp Gly Glu Ala Asp Leu Ala Arg Met Lys Ala Leu Ser
145 150 155 160

Asp Ile Gly Leu Glu Leu Ser Ile Thr Gly Gly Ile Thr Pro Ala Asp
165 170 175

Leu Pro Leu Phe Lys Asp Ile Arg Val Lys Ala Phe Ile Ala Gly Arg
180 185 190

Ala Leu Ala Gly Ala Ala Asn Pro Ala Gln Val Ala Gly Asp Phe His
195 200 205

Ala Gln Ile Asp Ala Ile Trp Gly Gly Ala Arg Ala
210 215 220



<210> 29 
<211> 225 
<212> PRT 
<213> YiaQ-Hi 



<400> 29 

Met Gly Lys Pro Leu Leu Gln Ile Ala Leu Asp Ala Gln Tyr Leu Glu
1 5 10 15

Thr Ala Leu Val Asp Val Lys Gln Ile Glu His Asn Ile Asp Ile Ile
20 25 30

Glu Val Gly Thr Ile Leu Ala Cys Ser Glu Gly Met Arg Ala Val Arg
35 40 45

Ile Leu Arg Ala Leu Tyr Pro Asn Gln Ile Leu Val Cys Asp Leu Lys
50 55 60

Thr Thr Asp Ala Gly Ala Thr Leu Ala Lys Met Ala Phe Glu Ala Gly
65 70 75 80

Ala Asp Trp Leu Thr Val Ser Ala Ala Ala His Pro Ala Thr Lys Ala
85 90 95

Ala Cys Gln Lys Val Ala Glu Glu Phe Asn Lys Ile Gln Pro Asn Leu
100 105 110

Gly Val Pro Lys Glu Ile Gln Ile Glu Leu Tyr Gly Asn Trp Asn Phe
115 120 125

Asp Glu Val Lys Asn Trp Leu Gln Leu Gly Ile Lys Gln Ala Ile Tyr
130 135 140

His Arg Ser Arg Asp Ala Glu Leu Ser Gly Leu Ser Trp Ser Asn Gln
145 150 155 160

Asp Ile Glu Asn Ile Glu Lys Leu Asp Ser Leu Gly Ile Glu Leu Ser
165 170 175

Ile Thr Gly Gly Ile Thr Pro Asp Asp Leu His Leu Phe Lys Asn Thr
180 185 190

Lys Asn Leu Lys Ala Phe Ile Ala Gly Arg Ala Leu Val Gly Lys Ser
195 200 205

Gly Arg Glu Ile Ala Glu Gln Leu Lys Gln Lys Ile Gly Gln Phe Trp
210 215 220

Ile
225



<210> 30 
<211> 297 
<212> PRT 
<213> YiaR-Ec 



<400> 30 
Met Arg Lys Ser Thr Leu Ser Gly Glu Val Arg Val Arg Asn His Gln
1 5 10 15

Leu Gly Ile Tyr Glu Lys Ala Leu Ala Lys Asp Leu Ser Trp Pro Glu
20 25 30

Arg Leu Val Leu Ala Lys Ser Cys Gly Phe Asp Phe Val Glu Met Ser
35 40 45

Val Asp Glu Thr Asp Glu Arg Leu Ser Arg Leu Asp Trp Ser Ala Ala
50 55 60

Gln Arg Thr Ser Leu Val Ala Ala Met Ile Glu Thr Gly Val Gly Ile
65 70 75 80

Pro Ser Met Cys Leu Ser Ala His Arg Arg Phe Pro Phe Gly Ser Arg
85 90 95

Asp Glu Ala Val Arg Glu Arg Ala Arg Glu Ile Met Ser Lys Ala Ile
100 105 110

Arg Leu Ala Arg Asp Leu Gly Ile Arg Thr Ile Gln Leu Ala Gly Tyr
115 120 125

Asp Val Tyr Tyr Glu Asp His Asp Glu Gly Thr Arg Gln Arg Phe Ala
130 135 140

Glu Gly Leu Ala Trp Ala Val Glu Gln Ala Ala Ala Ser Gln Val Met
145 150 155 160

Leu Ala Val Glu Ile Met Asp Thr Ala Phe Met Asn Ser Ile Ser Lys
165 170 175

Trp Lys Lys Trp Asp Glu Met Leu Ala Ser Pro Trp Phe Thr Val Tyr
180 185 190

Pro Asp Val Gly Asn Leu Ser Ala Trp Gly Asn Asp Val Pro Ala Glu
195 200 205

Leu Lys Leu Gly Ile Asp Arg Ile Ala Ala Ile His Leu Lys Asp Thr
210 215 220

Gln Pro Val Thr Gly Gln Ser Pro Gly Gln Phe Arg Asp Val Pro Phe
225 230 235 240

Gly Glu Gly Cys Val Asp Phe Val Gly Ile Phe Lys Thr Leu His Lys
245 250 255

Leu Asn Tyr Arg Gly Ser Phe Leu Ile Glu Met Trp Thr Glu Lys Ala
260 265 270

Lys Glu Pro Val Leu Glu Ile Ile Gln Ala Arg Arg Trp Ile Glu Ala
275 280 285

Arg Met Gln Glu Ala Gly Phe Ile Cys
290 295



<210> 31 
<211> 286 
<212> PRT 
<213> YiaR-Hi

<400> 31

Met Lys Lys His Lys Ile Gly Ile Tyr Glu Lys Ala Leu Pro Lys Asn
1 5 10 15

Ile Thr Trp Gln Glu Arg Leu Ser Leu Ala Lys Ala Cys Gly Phe Glu
20 25 30

Phe Ile Glu Met Ser Ile Asp Glu Ser Asn Asp Arg Leu Ser Arg Leu
35 40 45

Asn Trp Thr Lys Ser Glu Arg Ile Ala Leu His Gln Ser Ile Ile Gln
50 55 60

Ser Gly Ile Thr Ile Pro Ser Met Cys Leu Ser Ala His Arg Arg Phe
65 70 75 80

Pro Phe Gly Ser Lys Asp Lys Lys Ile Arg Gln Lys Ser Phe Glu Ile
85 90 95

Met Glu Lys Ala Ile Asp Leu Ser Val Asn Leu Gly Ile Arg Thr Ile
100 105 110

Gln Leu Ala Gly Tyr Asp Val Tyr Tyr Glu Lys Gln Asp Glu Glu Thr
115 120 125

Ile Lys Tyr Phe Gln Glu Gly Ile Glu Phe Ala Val Thr Leu Ala Ala
130 135 140

Ser Ala Gln Val Thr Leu Ala Val Glu Ile Met Asp Thr Pro Phe Met
145 150 155 160

Ser Ser Ile Ser Arg Trp Lys Lys Trp Asp Thr Ile Ile Asn Ser Pro
165 170 175

Trp Phe Thr Val Tyr Pro Asp Ile Gly Asn Leu Ser Ala Trp Asn Asn
180 185 190

Asn Ile Glu Glu Glu Leu Thr Leu Gly Ile Asp Lys Ile Ser Ala Ile
195 200 205

His Leu Lys Asp Thr Tyr Pro Val Thr Glu Thr Ser Lys Gly Gln Phe
210 215 220

Arg Asp Val Pro Phe Gly Gln Gly Cys Val Asp Phe Val His Phe Phe
225 230 235 240

Ser Leu Leu Lys Lys Leu Asn Tyr Arg Gly Ala Phe Leu Ile Glu Met
245 250 255

Trp Thr Glu Lys Asn Glu Glu Pro Leu Leu Glu Ile Ile Gln Ala Arg
260 265 270

Lys Trp Ile Val Gln Gln Met Glu Lys Ala Gly Leu Leu Cys
275 280 285



<210> 32 
<211> 231 
<212> PRT 
<213> YiaS-Ec 

<400> 32 

Met Leu Glu Gln Leu Lys Ala Asp Val Leu Ala Ala Asn Leu Ala Leu
1 5 10 15

Pro Ala His His Leu Val Thr Phe Thr Trp Gly Asn Val Ser Ala Val
20 25 30

Asp Glu Thr Arg Gln Trp Met Val Ile Lys Pro Ser Gly Val Glu Tyr
35 40 45

Asp Val Met Thr Ala Asp Asp Met Val Val Val Glu Ile Ala Ser Gly
50 55 60

Lys Val Val Glu Gly Ser Lys Lys Pro Ser Ser Asp Thr Pro Thr His
65 70 75 80

Leu Ala Leu Tyr Arg Arg Tyr Ala Glu Ile Gly Gly Ile Val His Thr
85 90 95

His Ser Arg His Ala Thr Ile Trp Ser Gln Ala Gly Leu Asp Leu Pro
100 105 110

Ala Trp Gly Thr Thr His Ala Asp Tyr Phe Tyr Gly Ala Ile Pro Cys
115 120 125

Thr Arg Gln Met Thr Ala Glu Glu Ile Asn Gly Glu Tyr Glu Tyr Gln
130 135 140

Thr Gly Glu Val Ile Ile Glu Thr Phe Glu Glu Arg Gly Arg Ser Pro
145 150 155 160

Ala Gln Ile Pro Ala Val Leu Val His Ser His Gly Pro Phe Ala Trp
165 170 175

Gly Lys Asn Ala Ala Asp Ala Val His Asn Ala Val Val Leu Glu Glu
180 185 190

Cys Ala Tyr Met Gly Leu Phe Ser Arg Gln Leu Ala Pro Gln Leu Pro
195 200 205

Ala Met Gln Asn Glu Leu Leu Asp Lys His Tyr Leu Arg Lys His Gly
210 215 220

Ala Asn Ala Tyr Tyr Gly Gln
225 230



<210> 33 
<211> 231 
<212> PRT 
<213> YiaS-Hi 

<400> 33 

Met Leu Ala Gln Leu Lys Lys Glu Val Phe Glu Ala Asn Leu Ala Leu
1 5 10 15

Pro Lys His His Leu Val Thr Phe Thr Trp Gly Asn Val Ser Ala Ile
20 25 30

Asp Arg Glu Lys Asn Leu Val Val Ile Lys Pro Ser Gly Val Asp Tyr
35 40 45

Asp Val Met Thr Glu Asn Asp Met Val Val Val Asp Leu Phe Thr Gly
50 55 60

Asn Ile Val Glu Gly Asn Lys Lys Pro Ser Ser Asp Thr Pro Thr His
65 70 75 80

Leu Glu Leu Tyr Arg Gln Phe Pro His Ile Gly Gly Ile Val His Thr
85 90 95

His Ser Arg His Ala Thr Ile Trp Ala Gln Ala Gly Leu Asp Ile Ile
100 105 110

Glu Val Gly Thr Thr His Gly Asp Tyr Phe Tyr Gly Thr Ile Pro Cys
115 120 125

Thr Arg Gln Met Thr Thr Lys Glu Ile Lys Gly Asn Tyr Glu Leu Glu
130 135 140

Thr Gly Lys Val Ile Val Glu Thr Phe Leu Ser Arg Gly Ile Glu Pro
145 150 155 160

Asp Asn Ile Pro Ala Val Leu Val His Ser His Gly Pro Phe Ala Trp
165 170 175

Gly Lys Asp Ala Asn Asn Ala Val His Asn Ala Val Val Leu Glu Glu
180 185 190

Val Ala Tyr Met Asn Leu Phe Ser Gln Gln Leu Asn Pro Tyr Leu Ser
195 200 205

Pro Met Gln Lys Asp Leu Leu Asp Lys His Tyr Leu Arg Lys His Gly
210 215 220

Gln Asn Ala Tyr Tyr Gly Gln
225 230
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